

WILEY 


Introduction to 
Abstract Algebra 




Fourth Edition 


W. Keith Nicholson 
University of Calgary 
Calgary, Alberta, Canada 




A JOHN WILEY & SONS, INC., PUBLICATION


Copyright © 2012 by John Wiley & Sons, Inc. All rights reserved.


Published by John Wiley & Sons, Inc., Hoboken, New Jersey. 
Published simultaneously in Canada. 


No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any 
form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, 
except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without 
either the prior written permission of the Publisher, or authorization through payment of the 
appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, 
MA 01923, (978) 750-8400, fax (978) 646-8600, or on the web at www.copyright.com. Requests 
to the Publisher for permission should be addressed to the Permissions Department, John Wiley 
& Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008. 


Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best 
efforts in preparing this book, they make no representations or warranties with respect to the 
accuracy or completeness of the contents of this book and specifically disclaim any implied 
warranties of merchantability or fitness for a particular purpose. No warranty may be created or
extended by sales representatives or written sales materials. The advice and strategies contained
herein may not be suitable for your situation. You should consult with a professional where
appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other 
commercial damages, including but not limited to special, incidental, consequential, or other 
damages. 


For general information on our other products and services please contact our Customer Care 
Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002.


Wiley also publishes its books in a variety of electronic formats. Some content that appears in 
print, however, may not be available in electronic format. 


Library of Congress Cataloging-in-Publication Data: 


Nicholson, W. Keith. 
Introduction to abstract algebra / W. Keith Nicholson. — 4th ed. 
p. cm. 

Includes bibliographical references and index. 
ISBN 978-1-118-13535-8 (cloth) 

1. Algebra, Abstract. I. Title. 
QA162.N53 2012 
512’.02-dc23 2011031416


Printed in the United States of America. 
10 9 8 7 6 5 4 3 2 1


Contents 


PREFACE 

ACKNOWLEDGMENTS 

NOTATION USED IN THE TEXT 

A SKETCH OF THE HISTORY OF ALGEBRA TO 1929 


0 Preliminaries 1


0.1 Proofs / 1
0.2 Sets / 5
0.3 Mappings / 9
0.4 Equivalences / 17


1 Integers and Permutations 23


1.1 Induction / 24
1.2 Divisors and Prime Factorization / 32
1.3 Integers Modulo n / 42
1.4 Permutations / 53
1.5 An Application to Cryptography / 67


2 Groups 69


2.1 Binary Operations / 70
2.2 Groups / 76
2.3 Subgroups / 86
2.4 Cyclic Groups and the Order of an Element / 90




2.5 Homomorphisms and Isomorphisms / 99 

2.6 Cosets and Lagrange’s Theorem / 108 

2.7 Groups of Motions and Symmetries / 117 

2.8 Normal Subgroups / 122 

2.9 Factor Groups / 131 

2.10 The Isomorphism Theorem / 137 

2.11 An Application to Binary Linear Codes / 143 


3 Rings 159


3.1 Examples and Basic Properties / 160
3.2 Integral Domains and Fields / 171
3.3 Ideals and Factor Rings / 180
3.4 Homomorphisms / 189
3.5 Ordered Integral Domains / 199


4 Polynomials 202


4.1 Polynomials / 203
4.2 Factorization of Polynomials Over a Field / 214
4.3 Factor Rings of Polynomials Over a Field / 227
4.4 Partial Fractions / 236
4.5 Symmetric Polynomials / 239
4.6 Formal Construction of Polynomials / 248


5 Factorization in Integral Domains 251


5.1 Irreducibles and Unique Factorization / 252
5.2 Principal Ideal Domains / 264


6 Fields 274


6.1 Vector Spaces / 275
6.2 Algebraic Extensions / 283
6.3 Splitting Fields / 291
6.4 Finite Fields / 298
6.5 Geometric Constructions / 304
6.6 The Fundamental Theorem of Algebra / 308
6.7 An Application to Cyclic and BCH Codes / 310


7 Modules over Principal Ideal Domains 324


7.1 Modules / 324
7.2 Modules Over a PID / 335




8 p-Groups and the Sylow Theorems 349 


8.1 Products and Factors / 350 

8.2 Cauchy’s Theorem / 357 

8.3 Group Actions / 364 

8.4 The Sylow Theorems / 371 

8.5 Semidirect Products / 379 

8.6 An Application to Combinatorics / 382 


9 Series of Subgroups 388 


9.1 The Jordan-Hölder Theorem / 389
9.2 Solvable Groups / 395 
9.3 Nilpotent Groups / 401 


10 Galois Theory 412 


10.1 Galois Groups and Separability / 413 

10.2 The Main Theorem of Galois Theory / 422 

10.3 Insolvability of Polynomials / 434 

10.4 Cyclotomic Polynomials and Wedderburn’s Theorem / 442 


11 Finiteness Conditions for Rings and Modules 447


11.1 Wedderburn’s Theorem / 448
11.2 The Wedderburn-Artin Theorem / 457


Appendices 471


Appendix A Complex Numbers / 471 

Appendix B Matrix Algebra / 478 

Appendix C Zorn’s Lemma / 486 

Appendix D Proof of the Recursion Theorem / 490 


BIBLIOGRAPHY 492 
SELECTED ANSWERS 495 
INDEX 523 


Preface 


This book is a self-contained introduction to the basic structures of abstract algebra: 
groups, rings, and fields. It is designed to be used in a two-semester course for 
undergraduates or a one-semester course for seniors or graduates. The table of 
contents is flexible (see the chapter summaries that follow), so the book is suitable for
a traditional course at various levels or for a more application-oriented treatment. 
The book is written to be read by students with little outside help and so can be 
used for self-study. In addition, it contains several optional sections on special topics 
and applications. 


Because many students will not have had much experience with abstract thinking, a 
number of important concrete examples (number theory, integers modulo n, permu- 
tations) are introduced at the beginning and referred to throughout the book. 
These examples are chosen for their importance and intrinsic interest and also be- 
cause the student can do actual computations almost immediately even though the 
examples are, in the student’s view, quite abstract. Thus, they provide a bridge 
to the abstract theory and serve as prototype examples of the abstract structures 
themselves. As an illustration, the student will encounter composition and inverses 
of permutations before having to fit these notions into the general framework of 
group theory. 


The axiomatic development of these structures is also emphasized. Algebra provides 
one of the best illustrations of the power of abstraction to strip concrete examples 
of nonessential aspects and so to reveal similarities between ostensibly different 
objects and to suggest that a theorem about one structure may have an analogue 
for a different structure. Achieving this sort of facility with abstraction is one of the 
goals of the book. This goes hand in hand with another goal: to teach the student 
how to do proofs. The proofs of most theorems are at least as important for the 
techniques as for the theorems themselves. Hence, whenever possible, techniques 
are introduced in examples before giving them in the general case as a proof. This 
partly explains the large number of examples (over 450) in the book. 




Of course, a generous supply of exercises is essential if this subject is to have a 
lasting impact on students, and the book contains more than 1450 exercises (many 
with separate parts). For the most part, computational exercises appear first, and 
the exercises are given in ascending order of difficulty. Hints are given for the less 
straightforward problems, and answers are provided to odd numbered (parts of) 
computational exercises and to selected theoretical exercises. (A student solution 
manual is available.) While exercises are vital to understanding this subject, they 
are not used to develop results needed later in the text. 


An increasing number of students of abstract algebra come from outside mathe- 
matics and, for many of them, the lure of pure abstraction is not as strong as for 
mathematicians. Therefore, applications of the theory are included that make the 
subject more meaningful and lively for these students (and for the mathematicians!). 
These include cryptography, linear codes, cyclic and BCH codes, and combinatorics, 
as well as “theoretical” applications within mathematics, such as the impossibility 
of the classical geometric constructions. Moreover, the inclusion of short historical 
notes and biographies should help the reader put the subject into perspective. In 
the same spirit, some classical “gems” appear in optional sections (one example is 
the elegant proof of the fundamental theorem of algebra in Section 6.6, using the 
structure theorem for symmetric polynomials). In addition, the modern flavor of 
the subject is conveyed by mentioning some unsolved problems and recent achieve- 
ments, and by occasionally stating more advanced theorems that extend beyond 
the results in the book. 


Apart from that, the material is quite standard. The aim is to reveal the basic
facts about groups, rings, and fields and give the student the working tools for 
applications and further study. The level of exposition rises slowly throughout the 
book and no prior knowledge of abstract algebra is required. Even linear algebra is 
not needed. Except for a few well-marked instances, the aspects of linear algebra 
that are needed are developed in the text. Calculus is completely unnecessary. Some 
preliminary topics that are needed are covered in Chapter 0, with appendices on 
complex numbers and matrix algebra (over a commutative ring). 


Although the chapters are necessarily arranged in a linear order, this is by no 
means true of the contents, and the student (as well as the instructor) should keep 
the chapter dependency diagram in mind. A glance at that diagram shows that 
while Chapters 1-4 are the core of the book, there is enough flexibility in the 
remaining chapters to accommodate instructors who want to create a wide variety 
of courses. The jump from Chapter 6 to Chapter 10 deserves mention. The student 
has a choice at the end of Chapter 6: either change the subject and return to 
group theory or continue with fields in Chapter 10 (solvable groups are adequately 
reviewed in Section 10.3, so Chapter 9 is not necessary). The chapter summaries 
that follow, and the chapter dependency diagram, can assist in the preparation of 
a course syllabus. 


Our introductory course at Calgary of 36 lectures touches Sections 0.3 and 
0.4 lightly and then covers Chapters 1-4 except for Sections 1.5, 2.11, 3.5, and 
4.4-4.6. The sequel course (also 36 lectures) covers Chapters 5, 6, 10, 7, 8, and 9,
omitting Sections 6.6, 6.7, 8.5, 8.6, and 10.4 and Chapter 11. 




FEATURES 


This book offers the following significant features: 


Self-contained treatment, so the book is suitable for self-study.

Preliminary material for self-study or review available in Chapter 0 and in
Appendices A and B.


Elementary number theory, integers modulo n, and permutations done first as
a bridge to abstraction.


Over 450 worked examples to guide the student. 


Over 1450 exercises (many with parts), graded in difficulty, with selected 
answers. 

Gradual increase in level throughout the text. 

Applications to number theory, combinatorics, geometry, cryptography, cod- 
ing, and equations. 

Flexibility in syllabus construction and choice of optional topics (see chapter 
dependency diagram). 

Historical notes and biographies. 

Several special topics (for example, symmetric polynomials, nilpotent groups, 
and modules). 

Solution manual containing answers or solutions to all exercises. 

Student solution manual available with solutions to all odd numbered (parts 
of) exercises. 


CHANGES IN THE THIRD EDITION (2007) 


The important concept of a module was introduced and used in Chapters 7 and 11. 


Chapter 7 on finitely generated abelian groups was completely rewritten, mod- 
ules were introduced, direct sums were studied, and the rank of a free module 
was defined (for commutative rings). Then the structure of finitely generated 
modules over a PID was determined. 

Chapter 11 was upgraded from finite dimensional algebras to rings with the
descending chain condition. Wedderburn’s characterization of simple artinian
rings and the Wedderburn-Artin theorem on semisimple rings were proved.

A new section on semidirect products of groups was added.

Appendices on Zorn’s lemma and the recursion theorem were added. 

More solutions to theoretical exercises were included in the Selected Answers 
section. 


CHANGES IN THE FOURTH EDITION 


The changes in the Third Edition primarily involved new concepts (modules, semidirect
products, etc.). However, the changes in the Fourth Edition are more
“microscopic” in nature, having more to do with clarity of exposition and making
the “flow” of arguments more natural and inevitable. Of course, minor editorial
changes are made throughout the book to correct typographical errors, improve the
exposition, and in some cases remove unnecessary material. Here are some more 
specific changes. 




Because of the increasing importance of modules in the undergraduate curricu- 
lum, the new material on modules over a PID (Chapter 7) and the Wedder- 
burn theorems (Chapter 11) introduced in the Third Edition was thoroughly 
reviewed for clarity of exposition. 


More generally, in an effort to make the book more accessible to students, the 
writing was carefully edited to ensure readability and clarity, the goal being 
to make arguments flow naturally and, as much as possible, effortlessly. Of 
course, this is in accord with the goal of making the book more suitable for 
self-study. 

Appendix B is expanded to an exposition of matrix algebra over a commuta- 
tive ring. 

Two notational changes are introduced. First, the symbol o(g) replaces |g| for 
the order of an element g in a group, reducing confusion with the cardinality 
|X| of a set. Second, polynomials f(x) are written simply as f. 

In Chapter 2, proofs of two early examples of “structure theorems” are given 
to motivate the subject: A group of order 2p (p a prime) is cyclic or dihedral, 
and an abelian group of order $p^2$ is $C_{p^2}$ or $C_p \times C_p$.

More emphasis is placed on characteristic subgroups and on the product HK 
of subgroups H and K. 

Wilson’s theorem is included in §1.3 with later applications to number theory 
and fields. 

In Chapter 5, it is shown that an integral domain is a UFD if and only if it
has the ACC on principal ideals and either (a) every irreducible is prime, or 
(b) any two nonzero elements have a greatest common divisor. This shortens 
the original proof (with (a) only) at the expense of a lemma of independent 
interest. 


In Chapter 6, a simpler proof is given that any finite multiplicative subgroup 
of a field is cyclic. 

The first section of Chapter 8 has been completely rewritten with several 
results added. 


In Chapter 9, several new results on nilpotent groups have been included. 
In particular, the Fitting subgroup of any finite group G is introduced, sev- 
eral properties are deduced, and its relationship to the Frattini subgroup is 
explained. 

In Chapter 10, many arguments are rewritten and clarified, in particular the
lemma explaining the basic Galois connection between the subgroups of the
Galois group of a field extension and the intermediate fields of the extension.

In Chapter 11, a new elementary proof is given that $R \cong L^n$, where L is a simple
left ideal of the simple ring R. This directly leads to Wedderburn’s theorem,
and the proof does not involve the theory of semisimple modules.

A student solution manual is now available giving detailed solutions to all odd 
numbered (parts of) exercises. 


CHAPTER SUMMARIES 


Chapter 0. Preliminaries. This chapter should be viewed as a primer on math- 
ematics because it consists of materials essential to any mathematics major. The 
treatment is self-contained. I personally ask students to read Sections 0.1 and 0.2, 
and I touch briefly on the highlights of Sections 0.3 and 0.4. (Our students have 
had complex numbers and one semester of linear algebra, so a review of Appendices 
A and B is left to them.) 


Chapter 1. Integers and Permutations. This chapter covers the fundamental 
properties of the integers and the two prototype examples of rings and groups: the 
integers modulo n and the permutation group $S_n$. These are presented naively and
allow the students to begin doing ring and group calculations in a concrete setting. 


Chapter 2. Groups. Here, the basic facts of group theory are developed, including
cyclic groups, Lagrange’s theorem, normal subgroups, factor groups, homomorphisms,
and the isomorphism theorem. The simplicity of the alternating groups $A_n$
is established for $n \geq 5$. An optional application to binary linear codes is included.


Chapter 3. Rings. The basic properties of rings are developed: integral domains, 
characteristic, rings of quotients, ideals, factor rings, homomorphisms and the isomorphism
theorem. Simple rings are studied, and it is shown that the ring of $n \times n$
matrices over a division ring is simple. 


Chapter 4. Polynomials. After the usual elementary facts are developed, 
irreducible polynomials are discussed and the unique factorization of polynomials 
over a field is proved. The factor rings of polynomials over a field are described 
in detail, and some finite fields are constructed. In an optional section, symmetric 
polynomials are discussed and the fundamental structure theorem is proved. 


Chapter 5. Factorization in Integral Domains. Unique factorization domains 
(UFDs) are characterized in terms of irreducibles, primes, and greatest common 
divisors. The fact that being a UFD is inherited by polynomial rings is derived. 
Principal ideal domains and euclidean domains are discussed. This chapter is self- 
contained, and the material presented is not required elsewhere. 


Chapter 6. Fields. After a minimal amount of vector space theory is developed, 
splitting fields are constructed and used to completely describe finite fields. This 
topic is a direct continuation of Section 4.3. In optional sections, the classical re- 
sults on geometric constructions are derived, the fundamental theorem of algebra 
is proved, and the theory of cyclic and BCH codes is developed. 


Chapter 7. Modules over Principal Ideal Domains. Motivated by vector 
spaces (Section 6.1) and abelian groups, the idea of a module over a ring is in- 
troduced. Free modules are discussed and the uniqueness of the rank is proved 
for IBN rings. With abelian groups as the motivating example, the structure of 
finitely generated modules over a principal ideal domain is determined, yielding the 
fundamental theorem for finitely generated abelian groups. 


Chapter 8. p-Groups and the Sylow Theorems. This chapter is a direct continuation
of Section 2.10. After some preliminaries (including the correspondence
theorem), the class equation is developed and used to prove Cauchy’s theorem
and to derive the basic properties of p-groups. Then group actions are introduced,
motivated by the class equation and an extended Cayley theorem, and used to prove
the Sylow theorems. Semidirect products are presented. An optional application to
combinatorics is also included.


Chapter 9. Series of Subgroups. The chapter begins with composition series 
and the Jordan-Hölder theorem. Then solvable series are introduced, including the
derived series, and the basic properties of solvable groups are developed. Sections 
9.1 and 9.2 depend only on the second and third isomorphism theorems and the 
correspondence theorem in Section 8.1. Finally, in Section 9.3, central series are 
discussed and nilpotent groups are characterized as direct products of p-groups, 
and the Frattini and Fitting subgroups are introduced. 


Chapter 10. Galois Theory. Galois groups of field extensions are defined, sep- 
arable elements are introduced, and the main theorem of Galois theory is proved. 
Then it is shown that polynomials of degree 5 or more are not solvable in radicals. 
This requires only Chapter 6 (the reference to solvable groups in Section 10.3 is 
adequately reviewed there). Finally, cyclotomic polynomials are discussed and used 
(with the class equation) to prove Wedderburn’s theorem that every finite division 
ring is a field. 


Chapter 11. Finiteness Conditions for Rings and Modules. The ascending
and descending chain conditions on a module are introduced and the Jordan-Hölder
theorem is proved. Then endomorphism rings are used to prove Wedderburn’s
theorem that a simple, left artinian ring is a matrix ring over a division
ring. Next, semisimple modules are studied and the results are employed to prove 
the Wedderburn—Artin theorem that a semisimple ring is a finite product of matrix 
rings over division rings. In addition, it is shown that these semisimple rings are 
characterized as the rings with every module projective and as the semiprime, left 
artinian rings. 




Chapter Dependency Diagram

[Figure: chapter dependency diagram. Recoverable node labels: 1 Integers and Permutations; 4 Polynomials; 5 Factorization; 6 Fields; 7 Modules over PIDs; 8 p-Groups and the Sylow Theorems; 9 Series of Subgroups; 10 Galois Theory; 11 Finiteness Conditions. A dashed arrow indicates minor dependency.]


Acknowledgments 


I express my appreciation to the following people for their useful comments and 
suggestions for the first edition of the book: F. Doyle Alexander, Stephen F. Austin 
State University; Steve Benson, Saint Olaf College; Paul M. Cook II, Furman Uni- 
versity; Ronald H. Dalla, Eastern Washington University; Robert Fakler, University 
of Michigan—Dearborn; Robert M. Guralnick, University of Southern California; 
Edward K. Hinson, University of New Hampshire; Ron Hirschorn, Queen’s Uni- 
versity; David L. Johnson, Lehigh University; William R. Nico, California State 
University-Hayward; Kimmo I. Rosenthal, Union College; Erik Shreiner (deceased), 
Western Michigan University; S. Thomeier, Memorial University; and Marie A.
Vitulli, University of Oregon. 


I also want to thank all the readers who informed me about typographical and 
other minor errors in the third edition. Particular thanks go to: 


Carl Faith, Rutgers University, for giving the book a careful study and making 
many very useful suggestions, too numerous to list here; 


David French, Derbyshire, UK, for pointing out several typographical errors;

Michel Racine, Université d’Ottawa, for pointing out a mistake in an exercise
deducing the commutativity of addition in a ring from the other axioms;


Yoji Yoshii, Université d’Ottawa, for revealing two errors in the exercises 
for Chapter 5; 


Yiqiang Zhou, Memorial University of Newfoundland, for many helpful
suggestions and comments.

For the fourth edition, special thanks go to:


Jerome Lefebvre, University of Ottawa, for pointing out several typographical 
errors; 




Edgar Goodaire and his students, Memorial University, for finding dozens of 
typographical errors and making many useful suggestions; 


Keith Conrad, University of Connecticut, for many useful comments on the 
exposition; 

Nazih Nahlus, American University of Beirut, for the proof that a finite 
multiplicative group of a field is cyclic; 


Matthew Greenberg, University of Calgary, for pointing out that Burnside’s
lemma on counting orbits was due to Cauchy and Frobenius;


Milosz Kosmider, student, for correcting an error in Chapter 0; 


Yannis Avrithis, National Technical University of Athens, for pointing out 
dozens of typographical errors and making several suggestions. 


It is a pleasure to thank Steve Quigley for his generous assistance throughout the 
project. Thanks also go to the production staff at Wiley and particularly to Susanne 
Steitz-Filler for keeping the project on schedule and responding so quickly to all 
my questions. I also want to thank Joanne Canape for her vital assistance with the 
computer aspects of the project. 


Finally, I want to thank my wife, Kathleen, for her unfailing support. Without her 
understanding and cooperation during the many hours that I was absorbed with 
this project, this book would not exist. 


Notation used in the Text 


Symbol                        Description

$\Rightarrow$                 implication
                              logical equivalence
                              set membership
                              set containment
                              proper set containment
                              set of natural numbers
                              set of integers
                              set of rational numbers
                              set of real numbers
                              set of complex numbers
                              positive elements in these sets
                              empty set
                              union of sets
$A \cap B$                    intersection of sets
                              difference set
                              ordered pair
                              cartesian product of sets A and B
                              ordered n-tuple
                              mapping $\alpha$ from A to B
                              image of x under mapping $\alpha$
                              image of mapping $\alpha$
$|A|$                         number of elements in set A
                              composite of mappings $\alpha$ and $\beta$
                              identity mapping on set A
                              inverse of mapping $\alpha$
                              equivalence relation
                              equivalence class of a




Symbol                                        Description

$A_{\equiv}$                                  quotient set of equivalence $\equiv$
$n!$                                          n factorial
$\binom{n}{r}$                                binomial coefficient
$d \mid n$                                    d is a divisor of n
$\gcd(m, n)$, $\gcd(n_1, \ldots, n_r)$        greatest common divisor
lcm(m, n), lcm($n_1, \ldots, n_r$)            least common multiple
$a \equiv b \pmod{n}$                         congruence modulo n
                                              residue class of an integer a
                                              integers modulo n
$S_n$                                         symmetric group of degree n
                                              permutation $\sigma$ in $S_n$
                                              identity permutation in $S_n$
                                              cycle permutation in $S_n$
$A_n$                                         alternating group of degree n
                                              sign of permutation $\sigma$
                                              nth power of a
                                              inverse of a
                                              circle group
                                              group of nth roots of unity
                                              group of units of monoid M
                                              group of permutations of set X
                                              general linear group over R
$C_n$                                         cyclic group of order n
                                              Klein 4-group
                                              special linear group over R
                                              projective special linear group over F
                                              center of group G
                                              cyclic subgroup generated by g
$o(g)$                                        order of group element g
                                              subgroup generated by X
                                              automorphism group of G
                                              inner automorphism group of G
                                              right, left cosets of subgroup H
                                              index of subgroup H in G
                                              dihedral group
                                              H is a normal subgroup of G
                                              quaternion group
                                              factor group of G by K
                                              derived (commutator) subgroup of G
                                              kernel of a homomorphism $\alpha$
                                              set of binary n-tuples
                                              ring of functions $X \to R$
                                              ring of $n \times n$ matrices over R
                                              characteristic of a ring R

Symbol                            Description                                First Used

$\mathbb{Z}[i]$                   ring of gaussian integers                  164
$T_2(R)$                          upper triangular matrices over R           164
$Z(R)$                            center of a ring R                         165
$R^{op}$                          opposite ring                              169
$\mathbb{H}$                      quaternions                                174
ann(a)                            annihilator of element a                   182
$R^1$                             ring extension of a general ring R         194
$R[x]$                            ring of polynomials in x over R            203
deg f                             degree of polynomial f                     205
$\Phi_n(x)$                       cyclotomic polynomials                     221
$a \sim b$                        associates in an integral domain           253
span{$v_1, \ldots, v_n$}          space spanned by $v_1, \ldots, v_n$        277
dim V                             dimension of vector space V                279
                                  dimension of E over a subfield F           283
                                  field generated over F by $u_1, \ldots, u_n$   284
                                  field of algebraic numbers                 289
                                  formal derivative of f                     299
                                  Galois field of order $p^n$                300
                                  direct sum of modules                      325, 329
                                  direct sum of n copies of module M         325
                                  rank of free module M                      333
                                  torsion submodule of M                     334, 336
                                  p-primary component of M                   337
                                  conjugacy class of a                       358
                                  normalizer of a subgroup X                 359
                                  core of a subgroup H                       364
                                  orbit of x generated by G                  367
                                  stabilizer of x                            368
                                  number of Sylow p-subgroups                374
                                  semidirect product of K by H               380
length G                          composition length of G                    390
                                  higher derived subgroups for G             396
                                  central series for group G                 402, 403
$\Phi(G)$                         Frattini subgroup of G                     406
                                  Fitting subgroup of G                      408
                                  Galois group of E over F                   413
                                  automorphisms fixing subfield              425
                                  elements fixed by subgroup H               425
$s_k(x_1, x_2, \ldots, x_n)$      elementary symmetric polynomials           439
$\mathbb{Z}_{p^\infty}$           the Prüfer group for a prime p             449
hom(M, N)                         group of module homomorphisms              452
end(M)                            endomorphism ring of module M              452
$H(K)$                            homogeneous component                      461
re z, im z                        real, imaginary part of z                  472
$\bar{z}$, $|z|$                  conjugate, absolute value of z             473
$e^{i\theta}$                     notation for $\cos\theta + i\sin\theta$    474




A Sketch of the History of 
Algebra to 1929 


2500 BC Hieroglyphic numerals used in Egypt.
2400 BC Babylonians begin positional algebraic notation.
600 BC Pythagoreans discuss prime numbers.


250 Diophantus writes Arithmetica, using notation from which modern notation 
evolved, and insists on exact solutions of equations in integers. 


830 al-Khowarizmi writes Al-jabr, a textbook giving rules for solving linear and 
quadratic equations. 


1202 Leonardo of Pisa writes Liber abaci on arithmetic and algebraic equations. 


1545 Tartaglia solves the cubic, and Cardano publishes the result in his Ars Magna. 
Imaginary numbers are suggested. 


1580 Viéte uses vowels to represent unknown quantities, with consonants for 
constants. 


1629 Fermat becomes the founder of the modern theory of numbers. 
1636 Fermat and Descartes invent analytic geometry, using algebra in geometry. 
1749 Euler formulates the fundamental theorem of algebra. 


1771 Lagrange solves the general cubic and quartic by considering permutations 
of the roots. 


1799 Gauss publishes his first proof of the fundamental theorem of algebra. 
1801 Gauss publishes his Disquisitiones Arithmeticae. 

1813 Ruffini claims that the general quintic cannot be solved by radicals. 

1824 Abel proves that the general quintic cannot be solved by radicals. 

1829 Galois introduces groups of substitutions. 

1831 Galois sends his great memoir to the French Académie, but it is rejected. 
1843 Hamilton discovers the quaternions. 


1846 Kummer invents his ideal numbers.
1854 Cayley introduces the multiplication table of a group.


1870 Jordan publishes his monumental Traité, which explains Galois theory,
develops group theory, and introduces composition series.


1870 Kronecker proves the fundamental theorem of finite abelian groups.
1872 Sylow presents his results on what are now called the Sylow theorems.


1878 Cayley proves that every finite group can be represented as a group of
permutations.


1879 Dedekind defines algebraic number fields, studies the factorization of
algebraic integers into primes, and introduces the concept of an ideal.


1889 Peano formulates his axioms for the natural numbers.
1889 Hölder completes the proof of the Jordan-Hölder theorem.
1905 Wedderburn proves that finite division rings are commutative.


1908 Wedderburn proves his structure theorem for finite dimensional algebras with
no nilpotent ideals.


1921 Noether publishes her influential paper on chain conditions in ring theory.
1927 Artin extends Wedderburn’s 1908 paper to rings with the descending chain
condition.

1929 Noether establishes the modern approach to the theory of representations
of finite groups.


Chapter 0


Preliminaries


The science of Pure Mathematics, in its modern development, may claim to be the most 
original creation of the human spirit. 


—Alfred North Whitehead 


This brief chapter contains background material needed in the study of 
abstract algebra and introduces terms and notations used throughout the book. 
Presenting all this information at the beginning is preferable, because its introduction
at the point it is needed interrupts the continuity of the text. Moreover, we
can include enough detail here to help those readers who may be less prepared or 
are using the book for self-study. However, much of this material may be familiar. 
If so, just glance through it quickly and begin with Chapter 1, referring to this 
chapter only when necessary. 


0.1 PROOFS 


The essential quality of a proof is to compel belief. 


—Pierre de Fermat


Logic plays a basic role in human affairs. Scientists use logic to draw conclusions 
from experiments, judges use it to deduce consequences of the law, and mathematicians
use it to prove theorems. Logic arises in ordinary speech with assertions such
as “if John studies hard, he will pass the course,” or “if an integer n is divisible 
by 6, then n is divisible by 3.” In each case, the aim is to assert that if a certain 
statement is true, then another statement must also be true. In fact, if p and q 


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 




denote statements, most theorems take the form of an implication: “If p is true,
then q is true.” We write this in symbols as

\[ p \Rightarrow q \]

and read it as “p implies q.” Here, p is the hypothesis and q the conclusion of
the implication. Verification that $p \Rightarrow q$ is valid is the proof of the implication.
In this section, we examine the most common methods of proof¹ and illustrate each
technique with an example.


Method of Direct Proof. To prove $p \Rightarrow q$, demonstrate directly that q is true
whenever p is true.


Example 1. If n is an odd integer, show that $n^2$ is odd.


Solution. If n is odd, it has the form $n = 2k + 1$ for some integer k. Then
$n^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1$ is also odd because $2k^2 + 2k$ is an integer. □


Note that the computation $n^2 = 4k^2 + 4k + 1$ in Example 1 involves some
simple properties of arithmetic that we did not prove. Actually, a whole body of
mathematical information lies behind nearly every proof of any complexity,
although this fact usually is not stated explicitly.

Suppose that you are asked to verify that $n^2 \geq 0$ for every integer n. This
expression is an implication: If n is an integer, then $n^2 \geq 0$. To prove it, you might
consider separately the cases that $n > 0$, $n = 0$, and $n < 0$ and then show that
$n^2 \geq 0$ in each case. (You would have to invoke the fact that $0^2 = 0$ and that the
product of two positive, or two negative, integers is positive.) We formulate the
general method as follows:


Method of Reduction to Cases. To prove $p \Rightarrow q$, show that p implies at least
one of a list $p_1, p_2, \ldots, p_n$ of statements (the cases) and that $p_i \Rightarrow q$ for each i.


Example 2. If n is an integer, show that $n^2 - n$ is even.


Solution. Note that $n^2 - n = n(n - 1)$ is even if n or $n - 1$ is even. Hence,
given n, we consider the two cases that n is even or odd. Because $n - 1$ is
even in the second case, $n^2 - n$ is even in either case. □
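
The claims of Examples 1 and 2 can also be spot-checked by machine. The following Python sketch tests both for every integer in a finite range; the names `is_odd` and `check_examples` and the range bound are illustrative choices, not part of the text. A finite check of this kind is only evidence, whereas the proofs above cover every integer.

```python
def is_odd(n: int) -> bool:
    # In Python, n % 2 is 0 or 1 even when n is negative.
    return n % 2 == 1


def check_examples(limit: int = 1000) -> bool:
    """Test Examples 1 and 2 for every integer n with -limit <= n <= limit."""
    for n in range(-limit, limit + 1):
        # Example 1: if n is odd, then n^2 is odd.
        if is_odd(n) and not is_odd(n * n):
            return False
        # Example 2: n^2 - n is even.
        if (n * n - n) % 2 != 0:
            return False
    return True


print(check_examples())
```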


The statements used in mathematics must be true or false. This requirement 
leads to a proof technique that can mystify beginners. The method is a formal version
of a debating strategy whereby the debater assumes the truth of an opponent’s
position and shows that it leads to an absurd conclusion.


Method of Proof by Contradiction. To prove $p \Rightarrow q$, show that the assumption
that both p is true and q is false leads to a contradiction.


Example 3. If r is a rational number (fraction), show that $r^2 \neq 2$.


Solution. To argue by contradiction, we assume that r is a rational number and
that $r^2 = 2$ and show that this assumption leads to a contradiction. Let m and n
be integers such that $r = \frac{m}{n}$ is in lowest terms (so, in particular, m and n are
not both even). Then $r^2 = 2$ gives $m^2 = 2n^2$, so $m^2$ is even. This means m is even
(Example 1), say $m = 2k$. But then $2n^2 = m^2 = 4k^2$, so $n^2 = 2k^2$ is even, and hence
n is even. This shows that n and m are both even, contrary to the choice of n
and m. □


¹For a more detailed look at proof techniques, see Solow, D., How to Read and Do Proofs, 2nd
ed., Wiley, 1990; Lucas, J.F., Introduction to Abstract Mathematics, Wadsworth, 1986, Chapter 2.


Example 4. If $2^n - 1$ is a prime number, show that n is a prime number. (Here, a
prime number is an integer greater than 1 that cannot be factored as the product
of two smaller positive integers.)


Solution. We must show that $p \Rightarrow q$, where p is “$2^n - 1$ is a prime” and q is “n is
a prime.” Suppose that q is false so that n is not a prime, say $n = ab$, where $a \geq 2$
and $b \geq 2$ are integers. If we write $2^a = x$, then $2^n = 2^{ab} = (2^a)^b = x^b$. Hence,

\[ 2^n - 1 = x^b - 1 = (x - 1)(x^{b-1} + x^{b-2} + \cdots + x^2 + x + 1). \]

As $x \geq 4$, this factors $2^n - 1$ into smaller positive integers, a contradiction. □


The next example exhibits one way to show that an implication is not valid.


Example 5. Show that the implication “n is a prime $\Rightarrow$ $2^n - 1$ is a prime” is false.


Solution. The first few primes are n = 2, 3, 5, 7, and the corresponding values
$2^n - 1 = 3, 7, 31, 127$ are all prime, as the reader can verify. This observation seems
to be evidence that the implication is true. However, the next prime is n = 11 and
$2^{11} - 1 = 2047 = 23 \cdot 89$ clearly is not a prime. □


We say that n = 11 is a counterexample to the (proposed) implication in Example 
5. Note that if you can find even one example for which an implication is not valid, 
the implication is false. Thus, disproving implications in a sense is easier than 
proving them. 
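
A search for a counterexample of this kind is easy to automate. The sketch below uses plain trial division; the helper names `is_prime` and `first_counterexample` and the search bound are illustrative assumptions. It looks for the smallest prime n for which $2^n - 1$ fails to be prime.

```python
def is_prime(m: int) -> bool:
    """Trial division: adequate for the small numbers used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def first_counterexample(limit: int = 50):
    """Return (n, 2**n - 1) for the first prime n < limit with 2**n - 1 composite."""
    for n in range(2, limit):
        if is_prime(n) and not is_prime(2**n - 1):
            return n, 2**n - 1
    return None


print(first_counterexample())  # (11, 2047)
```

One failing case is all that is needed to disprove the implication, so the search can stop at the first hit.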

The implications in Examples 4 and 5 are closely related: They have the form
$p \Rightarrow q$ and $q \Rightarrow p$, where p and q are statements. Each is called the converse of
the other, and as the examples show, an implication can be valid even though its
converse is not valid. If both $p \Rightarrow q$ and $q \Rightarrow p$ are valid, the statements p and q
are called logically equivalent, which we write in symbols as

\[ p \Leftrightarrow q \]

and read “p if and only if q.” Many of the most satisfying theorems make the assertion
that two statements, ostensibly quite different, are in fact logically equivalent.


Example 6. If n is an integer, show that “n is odd $\Leftrightarrow$ $n^2$ is odd.”


Solution. In Example 1, we proved the implication “n is odd $\Rightarrow$ $n^2$ is odd.” Here,
we prove the converse by contradiction. If $n^2$ is odd, we assume that n is not odd.
Then n is even, say $n = 2k$, so $n^2 = 4k^2$ is also even, a contradiction. □


Many more examples of proofs can be found in this book and, although they 
are often more complex, most are based on one of these methods. In fact, abstract 
algebra is one of the best topics on which the reader can sharpen his or her skill at 
constructing proofs. Part of the reason for this is that much of abstract algebra is 
developed using the axiomatic method. That is, in the course of studying various 
examples, it is observed that they all have certain properties in common. Then when 
a general abstract system is studied in which these properties are assumed to hold 
(and are called axioms), statements (called theorems) are deduced from these 




axioms by using the methods presented in this section. These theorems will then 
be true in all the concrete examples because the axioms hold in each case. But 
this procedure is more than just an efficient method for finding theorems in 
examples. By reducing the proof to its essentials, we gain a better understanding of 
why the theorem is true and how it relates to analogous theorems in other abstract 
systems. 

The axiomatic method is not new. Euclid first used it in about 300 BC to 
derive all the propositions of (euclidean) geometry from a list of 10 axioms. The 
method lends itself well to abstract algebra. The axioms are simple and easy to 
understand, and there are only a few of them. For example, group theory contains 
a large number of theorems derived from only four simple axioms. 


Exercises 0.1 


1. In each case, prove the result and either prove the converse or give a counterexample.
(a) If n is an even integer, then $n^2$ is a multiple of 4.
(b) If m is an even integer and n is an odd integer, then m + n is odd.
(c) If x = 2 or x = 3, then $x^3 - 6x^2 + 11x - 6 = 0$.
(d) If $x^2 - 5x + 6 = 0$, then x = 2 or x = 3.
2. In each case, prove the result by splitting into cases or give a counterexample.
(a) If n is any integer, then $n^2 = 4k + 1$ for some integer k.
(b) If n is any odd integer, then $n^2 = 8k + 1$ for some integer k.
(c) If n is any integer, $n^3 - n = 3k$ for some integer k. [Hint: Use the fact that each
integer has one of the forms 3k, 3k + 1, or 3k + 2, where k is an integer.]
3. In each case, prove the result by contradiction and either prove the converse or give 
a counterexample. 
(a) If n > 2 is a prime integer, then n is odd. 
(b) If n+ m = 25, where n and m are integers, one of n and m is greater than 12. 
(c) If a and b are positive numbers and $a \leq b$, then $\sqrt{a} \leq \sqrt{b}$.
(d) If m and n are integers and mn is even, then m is even or n is even. 
4. Prove each implication by contradiction. 
(a) If x and y are positive numbers, then $\sqrt{x + y} \neq \sqrt{x} + \sqrt{y}$.
(b) If x is irrational and y is rational, then x + y is irrational.
(c) If 13 people are selected, at least 2 have birthdays in the same month. 
(d) Pigeonhole Principle. If n + 1 pigeons are placed in n holes, some hole contains 
at least two pigeons. 
5. Disprove each statement by giving a counterexample. 


(a) $n^2 + n + 11$ is a prime for all positive integers n.
(b) $n^2 \geq 2^n$ for all integers $n \geq 2$.
(c) If n points are arranged on a circle in such a way
that no three of the lines joining them have a
common point, these lines divide the circle into
$2^{n-1}$ regions. For example, if n = 4, there are
$8 = 2^3$ regions as shown in the figure.




6. If p is a statement, let $\sim p$ denote the statement “not p,” called the negation of p.
Thus, $\sim p$ is true when p is false, and false when p is true. Show that if $\sim q \Rightarrow \sim p$,
then $p \Rightarrow q$. [The implication $\sim q \Rightarrow \sim p$ is called the contrapositive of $p \Rightarrow q$.]


0.2 SETS 


No one shall expel us out of the paradise which Cantor has created for us. 
—David Hilbert 


Everyone has an idea of what a set is. If asked to define it, you would likely say that 
“a set is a collection of objects” or something similar. However, such a response 
just shifts the question to what a collection is, without any gain at all. To add 
to the problem, when you think of concrete examples of sets, such as the set of 
all atoms in the earth, or even of more abstract examples, such as the set of all 
positive integers, you can see at once that the idea of a set is closely related to 
another idea, that of membership in a set. These ideas are so fundamental that we 
make no attempt to define them, taking them as primitive concepts in the theory 
of sets. We then use them to define the other concepts of the theory intuitively. 
Certain basic properties of sets must be assumed (the axioms of the theory), but it 
is not our intention to pursue this axiomatic development here. Instead, we rely on 
intuitive ideas about sets to enable us to describe enough of set theory to provide 
the language of abstract algebra. 

Hence, we consider sets and call the members of a set the elements of the set. 
Sets are usually denoted by uppercase letters and elements by lowercase letters. 
The fact that a is an element of set A is denoted

\[ a \in A. \]

If A and B are sets, we say that A is contained in B if every element of A is an
element of B. In this case, we say that A is a subset of B and write

\[ A \subseteq B \quad \text{or equivalently} \quad B \supseteq A. \]

The intuitive idea that two sets are the same if they have the same elements is
reflected in the following axiom.


Principle of Set Equality. If A and B are sets, then

\[ A = B \quad \text{if and only if} \quad A \subseteq B \text{ and } B \subseteq A. \]

This principle is useful because often the easiest way to show that A = B is to verify
separately that $A \subseteq B$ and $B \subseteq A$. We use it frequently, often without comment.

If it is not the case that A = B, we write $A \neq B$. Similarly, we frequently use
the notations $x \notin A$ and $A \nsubseteq B$. If $A \subseteq B$ but $A \neq B$, we write $A \subset B$ and refer to
A as a proper subset of B.


Several important sets of numbers are represented by special symbols:

$\mathbb{N}$: the set of natural numbers (positive integers and zero)
$\mathbb{Z}$: the set of integers (whole numbers, positive, negative, and zero)
$\mathbb{Q}$: the set of rational numbers (quotients $\frac{m}{n}$ of integers, where $n \neq 0$)
$\mathbb{R}$: the set of real numbers
$\mathbb{C}$: the set of complex numbers

These notations are used throughout the book. Note that $\mathbb{N} \subseteq \mathbb{Z} \subseteq \mathbb{Q} \subseteq \mathbb{R} \subseteq \mathbb{C}$.
We write $\mathbb{Z}^+$, $\mathbb{Q}^+$, and $\mathbb{R}^+$ for the set of positive elements in these sets.

The only way to completely describe a set is to specify its elements in some 
unambiguous way. If the set has a finite number of elements, this is often accomplished
by listing the elements. Thus, we can describe the set A of positive integers
that are less than 6 as

\[ A = \{1, 2, 3, 4, 5\}. \]

We frequently describe the elements in a set as those members of some known set
that have a certain property. Thus, the set A may be described as follows:

\[ A = \{x \in \mathbb{Z} \mid 1 \leq x \leq 5\}, \]

which we read as “the set of elements x in $\mathbb{Z}$ such that $1 \leq x \leq 5$.” More generally, if
p(x) is any statement about the elements x of a known set U, the set of all elements
x of U for which p(x) is true is denoted

\[ \{x \in U \mid p(x)\}. \]

This notation has some variations, such as

\[
\begin{aligned}
\{0, 3, 6\} &= \{x \in \mathbb{Z} \mid x \text{ is a multiple of 3 and } 0 \leq x \leq 6\} \\
            &= \{x \in \mathbb{R} \mid x^3 - 9x^2 + 18x = 0\} \\
            &= \{3x \mid x = 0, 1, 2\}.
\end{aligned}
\]


We use such notations without further comment. 
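
Set-builder notation translates almost word for word into Python set comprehensions. The sketch below builds the three descriptions of {0, 3, 6} given above. The variable names and the finite search window are illustrative assumptions, and the second description is searched over integers only, which happens to suffice here because every real root of the cubic is an integer.

```python
# A finite window of integers to search; the bounds are an arbitrary choice.
window = range(-50, 51)

# {x in Z | x is a multiple of 3 and 0 <= x <= 6}
multiples = {x for x in window if x % 3 == 0 and 0 <= x <= 6}

# {x | x^3 - 9x^2 + 18x = 0}, searched over integers only
roots = {x for x in window if x**3 - 9 * x**2 + 18 * x == 0}

# {3x | x = 0, 1, 2}
images = {3 * x for x in (0, 1, 2)}

print(multiples, roots, images)
```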
If a finite set A has n elements, we often denote A as 


\[ A = \{a_1, a_2, \ldots, a_n\} = \{a_i \mid 1 \leq i \leq n\}. \]

We denote the number of elements in a finite set A as |A| and call sets with |A| = 1
singletons. If a set A is not finite, we say that A is infinite and write $|A| = \infty$.
Sometimes we list infinite sets; for example, $B = \{3, 5, 7, \ldots\}$ indicates the set of
odd integers greater than 1. However, this notation can be ambiguous; for example,
B could indicate the set of odd primes. Actually,

\[ B = \{2k + 1 \mid k \in \mathbb{Z}, k \geq 1\} \]

is a much better description of B because it reveals the pattern used to describe
the elements. Nonetheless, we use descriptions such as $B = \{3, 5, 7, \ldots\}$ when the
meaning is clear from the context.

We assume (this is an axiom) that there exists a set with no elements. This
set is called the empty set and is denoted $\emptyset$. Thus, $\{x \mid x \in \mathbb{R}, x^2 = -1\} = \emptyset$
because there is no real number x with $x^2 = -1$. The following property of $\emptyset$ is
used frequently:

\[ \emptyset \subseteq A \text{ for every set } A. \]


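The verification of this assertion provides a nice example of proof by contradiction.
Observe that $\emptyset \nsubseteq A$ implies the existence of an element $x \in \emptyset$ such that $x \notin A$, a
contradiction since $\emptyset$ has no element.

Let $A_1, A_2, \ldots, A_n$ be sets. We define their union $A_1 \cup A_2 \cup \cdots \cup A_n$ and their
intersection $A_1 \cap A_2 \cap \cdots \cap A_n$ as follows:

\[
\begin{aligned}
A_1 \cup A_2 \cup \cdots \cup A_n &= \{x \mid x \in A_i \text{ for some } i = 1, 2, \ldots, n\}, \\
A_1 \cap A_2 \cap \cdots \cap A_n &= \{x \mid x \in A_i \text{ for every } i = 1, 2, \ldots, n\}.
\end{aligned}
\]

These sets sometimes are denoted $\bigcup_{i=1}^{n} A_i$ and $\bigcap_{i=1}^{n} A_i$, respectively.

The intersection $A_1 \cap A_2 \cap \cdots \cap A_n$ is a subset of each of the sets $A_i$, and it
contains every set that is a subset of each $A_i$. Similarly, the union $A_1 \cup A_2 \cup \cdots \cup A_n$ contains each
of the sets $A_i$ and is contained in every set that contains each $A_i$.

If only two sets A and B are involved, we have

\[
\begin{aligned}
A \cup B &= \{x \mid x \in A \text{ or } x \in B, \text{ or both}\}, \\
A \cap B &= \{x \mid x \in A \text{ and } x \in B\}.
\end{aligned}
\]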


The use of Venn diagrams, named after the English logician John Venn, clarifies 
many properties of these operations. Points inside some region of the plane (say, 
the interior of a circle) represent the elements of a set. Then the shaded regions in 
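the diagram represent the sets $A \cap B$ and $A \cup B$.

[Venn diagrams: the shaded regions show $A \cap B$ and $A \cup B$.]

Using the principle of set equality, the following properties can be proved for
arbitrary sets A, B, and C:

\[
\begin{aligned}
A \cup A &= A, & A \cup B &= B \cup A, & A \cup (B \cup C) &= (A \cup B) \cup C, \\
A \cap A &= A, & A \cap B &= B \cap A, & A \cap (B \cap C) &= (A \cap B) \cap C.
\end{aligned}
\]

These are called the idempotent, commutative, and associative laws,
respectively. In addition, we have the distributive laws

\[
\begin{aligned}
A \cup (B_1 \cap B_2 \cap \cdots \cap B_n) &= (A \cup B_1) \cap (A \cup B_2) \cap \cdots \cap (A \cup B_n), \\
A \cap (B_1 \cup B_2 \cup \cdots \cup B_n) &= (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_n).
\end{aligned}
\]

The difference $A \setminus B$ of two sets consists of the elements of A that are not in
B, more formally

\[ A \setminus B = \{x \mid x \in A \text{ and } x \notin B\}. \]

This notation arises frequently, primarily for descriptive purposes.
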
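
Python's built-in `set` type gives a quick way to experiment with union, intersection, and difference. In the sketch below the sets `A`, `B1`, and `B2` are arbitrary illustrative choices, and the distributive laws are checked only for these particular sets, which illustrates the laws without proving them.

```python
A = {1, 2, 3, 4}
B1 = {3, 4, 5}
B2 = {4, 5, 6}

union = A | B1          # elements in A or B1 (or both)
intersection = A & B1   # elements in both A and B1
difference = A - B1     # elements of A that are not in B1

# The two distributive laws, for the two sets B1 and B2
law1 = (A | (B1 & B2)) == ((A | B1) & (A | B2))
law2 = (A & (B1 | B2)) == ((A & B1) | (A & B2))

print(union, intersection, difference, law1, law2)
```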
The sets {a,b} and {b,a} are equal because the order in which the elements of a 
set are listed is irrelevant. However, taking the order into consideration is frequently 


useful. A pair of elements is called an ordered pair when they are taken to be in 
a definite order. The notation 


\[ (a, b) \]

denotes the ordered pair in which the first member is a and the second is b. The
defining property is

\[ (a, b) = (a_1, b_1) \quad \text{if and only if} \quad a = a_1 \text{ and } b = b_1. \]

Thus, a and b are uniquely determined by the ordered pair (a, b), and they are called
the first and second components of the ordered pair. In particular, (a, b) and (b, a)
are distinct ordered pairs (assuming that $a \neq b$), in contrast to the equal sets {a, b}
and {b, a}. The most familiar use of ordered pairs is in describing the coordinates
(x, y) of a point in the euclidean plane.

The cartesian product A x B of two sets A and B is defined to be the set 


\[ A \times B = \{(a, b) \mid a \in A, b \in B\} \]

of all ordered pairs with the first component from A and the second component
from B.

The sets A and B can be equal here, and $A \times A$ is sometimes expressed as $A^2$.
For example, if A = {1, 2} and B = {1, 2, 3},

\[
\begin{aligned}
A \times A &= A^2 = \{(1, 1), (1, 2), (2, 1), (2, 2)\}, \\
A \times B &= \{(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)\}.
\end{aligned}
\]

Clearly, $\mathbb{R} \times \mathbb{R}$ is the euclidean plane, and this is the source of the term cartesian.
The name honors René Descartes, who used such coordinates in his work on
geometry.²

By analogy with ordered pairs, we call a set of elements $a_1, a_2, \ldots, a_n$ an
ordered n-tuple if they are arranged in a definite order. We use the notation

\[ (a_1, a_2, \ldots, a_n) \]

for ordered n-tuples, and the defining property is

\[ (a_1, a_2, \ldots, a_n) = (b_1, b_2, \ldots, b_n) \quad \text{if and only if} \quad a_i = b_i \text{ for each } i. \]

We call $a_i$ the ith component of the n-tuple $(a_1, a_2, \ldots, a_n)$. If $A_1, A_2, \ldots, A_n$
are sets, their cartesian product $A_1 \times A_2 \times \cdots \times A_n$ is defined to be the set

\[ A_1 \times A_2 \times \cdots \times A_n = \{(a_1, a_2, \ldots, a_n) \mid a_i \in A_i \text{ for each } i\} \]

of all ordered n-tuples whose ith component belongs to $A_i$ for each i.
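
Cartesian products of small finite sets can be generated with `itertools.product` from the Python standard library. The sketch below rebuilds the products of A = {1, 2} and B = {1, 2, 3} discussed above and adds a product of three sets; the variable names are illustrative.

```python
from itertools import product

A = {1, 2}
B = {1, 2, 3}

A_squared = set(product(A, A))   # A x A, a set of ordered pairs
A_times_B = set(product(A, B))   # A x B
triples = set(product(A, B, A))  # A x B x A, a set of ordered 3-tuples

print(sorted(A_squared))
print(sorted(A_times_B))
print(len(triples))
```

Python tuples compare component by component, so `(1, 2) == (2, 1)` is false while `{1, 2} == {2, 1}` is true, mirroring the contrast between ordered pairs and sets.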


Exercises 0.2 


1. In each case, describe A in the notation $A = \{x \mid p(x)\}$.
(a) A is the set of all positive multiples of 5. 
(b) A is the set of all integers between —4 and 8. 

2. List the elements of the following sets. 


(a) $\{n \in \mathbb{N} \mid n^3 \text{ is odd}\}$
(b) $\{n \in \mathbb{N} \mid 2n + 1 < 16\}$
(c) $\{x \in \mathbb{R} \mid x^3 + 3x^2 - x - 3 = 0\}$
(d) {4 |n€Z,n#¢0}
(e) $\{x \in \mathbb{Q} \mid x^2 = 2\}$
(f) $\{n \in \mathbb{N} \mid 2 < 3n + 1 < 20\}$


²Actually these coordinates were known and used much earlier by Nicole Oresme (1323-1382).
See Boyer, C.B., A History of Mathematics, New York: Wiley, 1968, p. 379.



3. Which of the following pairs of sets are equal? Defend your answer. 


(a) $A = \{n \in \mathbb{Z} \mid n^2 < 4\}$    $B = \{x \in \mathbb{R} \mid x^2 - 3x + 2 = 0\}$
(b) A={nEZ|n= 4} B={zeER|2?=1} 

(c) A = the set of letters in “alloy”    B = the set of letters in “loyal”
(d) A = {2, {3}, 4}    B = {2, {3, 4}}
(e) A = {1}    B = {{1}}
(f) $A = \{x \in \mathbb{R} \mid x^2 = -1\}$    $B = \{x \in \mathbb{Q} \mid x^2 = 2\}$
(g) $A = \{x \in \mathbb{Z} \mid x^2 < 1\}$    $B = \{x \in \mathbb{R} \mid x^3 = x\}$


4. Let A= {1,2,3,4}, B= {1,2,3}, and C = {2,4}. Find all sets X satisfying each 
pair of conditions. 
(a) XC Band X CC (bl) X CAand X ZB 
(c) XC Band X ZC (d)X CBandX ZC 
5. In each case, prove the assertion if it is true or give a counterexample if it is false. 
(We temporarily suspend the convention of denoting elements by lowercase letters.) 
(a) If $A \in B$ and $B \subseteq C$, then $A \in C$.
(b) If $A \in B$ and $B \in C$, then $A \in C$.
(c) If $A \in B$ and $B \subseteq C$, then $A \subseteq C$.
(d) If $A \subseteq B$ and $B \in C$, then $A \in C$.
6. (a) Show that $A \cap B$ is the largest common subset of A and B in the sense that it
contains every such common subset.
(b) Show that $A \cup B$ is the smallest set containing both A and B in the sense that
it is contained in every such set.
7. Prove the distributive laws using the principle of set equality.
8. Let A and B be sets. If $A \cap X = B \cap X$ and $A \cup X = B \cup X$ for some set X, prove
that A = B. [Hint: $A = A \cap (A \cup X)$.]
9. Find sets A, B, and C such that $A \cap B \cap C = \emptyset$ but that none of $A \cap B$, $A \cap C$,
and $B \cap C$ is empty.
10. (a) If A and B are nonempty sets and $A \times B = B \times A$, show that A = B.
(b) Show that $A \times B = B \times A$ if and only if either A = B or one of A and B is
empty.
(c) Show that $A \cap B = \{x \mid (x, x) \in A \times B\}$.
11. (a) Prove that $A \times (B \cap C) = (A \times B) \cap (A \times C)$.
(b) Prove that $A \times (B \cup C) = (A \times B) \cup (A \times C)$.
(c) Prove that $(A \cap B) \times (A' \cap B') = (A \times A') \cap (B \times B')$.
12. Care must be taken in defining sets. Consider

\[ R = \{X \mid X \text{ is a set and } X \text{ is not an element of itself}\}. \]

Show that R cannot be a set. [Hint: If R is a set, is R a member of itself or not?]
The contradiction arising from the assumption that R is a set is called the Russell Paradox, after Bertrand Russell.


0.3. MAPPINGS 


The concept of a function is basic to all mathematics and real-valued functions are 
essential in calculus and elementary algebra. In this section, we introduce functions 
from any set A to any set B. These more general functions are called mappings to 
avoid confusion. In this generality, sets and mappings are the language of abstract 
algebra. 

In many applications of set theory, we are interested in some property or 
attribute of the elements a of a set A. For example, if A is the set of all 
people, the attribute of $a \in A$ might be the age of a or the gender of a. In each case,
the attribute is itself an element of another set B (in the latter case, B = {F, M}
will do). Hence, for each $a \in A$, there is a uniquely determined attribute $b \in B$. The
assignment $a \mapsto b$ is an example of a mapping.³ In general, if A and B are sets, a
mapping (or function) $\alpha$ from A to B, written

\[ \alpha : A \to B \quad \text{or} \quad A \xrightarrow{\alpha} B, \]

is a rule⁴ that assigns to every element a of A exactly one element $\alpha(a)$ of B. This
assignment is sometimes denoted $a \mapsto \alpha(a)$ (see the diagram). We refer to A and
B as the domain and codomain, respectively, of the mapping $\alpha$. For $a \in A$, the
unique element $b \in B$ such that $b = \alpha(a)$ is called the image of a under $\alpha$. The
notion of a mapping is one of the most fertile ideas in mathematics.

The process of defining a mapping $\alpha$ consists of two parts: First, we must specify
the domain A and codomain B of $\alpha$, and then, for every $a \in A$, we must specify
exactly one element $\alpha(a)$ in B that $\alpha$ assigns to a. We refer to this latter task as
defining the action of $\alpha$ and then say that the mapping is well defined. This can
be done in several ways.

[Diagram: the mapping $\alpha$ carries each element of A to an element of B.]


If the domain and codomain are sets of numbers, the most common way to
define a mapping is by means of a formula. Thus, $\alpha(x) = x^2 + 1$ and $\beta(x) = 3x - 2$
define mappings $\mathbb{R} \to \mathbb{R}$. Sometimes the mapping is given by a different formula on
different parts of the domain. For example,


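\[ \alpha : \mathbb{Z} \to \{1, -1\} \quad \text{given by} \quad \alpha(n) = \begin{cases} 1 & \text{if } n \text{ is even} \\ -1 & \text{if } n \text{ is odd} \end{cases} \]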


is a mapping. We can describe mappings with a finite domain by simply listing the
images of the domain’s elements. For example, we can define $\alpha : \{1, 2, 3\} \to \{a, b, c\}$
by stipulating that $\alpha(1) = a$, $\alpha(2) = a$, and $\alpha(3) = c$. We describe this action
graphically with an arrow diagram:

[Arrow diagram: $1 \to a$, $2 \to a$, $3 \to c$; no arrow ends at b.]


Example 1. Consider the correspondences $\alpha$ and $\beta$ from {1, 2, 3} to {a, b, c} with
actions given by the arrow diagrams:

[Arrow diagrams for $\alpha$ and $\beta$: in the diagram for $\alpha$, arrows run from 2 to both a and c; in the diagram for $\beta$, no arrow leaves 3.]

Then $\alpha$ is not well defined because $\alpha$ assigns both a and c to 2, and $\beta$ is not well
defined because $\beta$ assigns no element to 3.


³We will usually denote mappings by lowercase Greek letters $\alpha, \beta, \gamma, \ldots$.

⁴This definition has the difficulty that “rule” is just a synonym for “mapping.” This is
circumvented by the formal definition: A mapping $\alpha : A \to B$ is a set $\alpha \subseteq A \times B$ of ordered
pairs in which every element of A occurs exactly once as the first component of a pair in $\alpha$. Then,
for $a \in A$, the unique element $b \in B$ such that $(a, b) \in \alpha$ is denoted $b = \alpha(a)$.


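Example 2. Let $\alpha : \mathbb{Q} \to \mathbb{Z}$ be given by $\alpha(\frac{m}{n}) = m$. Then $\alpha$ is not well defined. In
fact, let $x = \frac{1}{2} = \frac{2}{4}$. Then $\alpha(x) = \alpha(\frac{1}{2}) = 1$ and $\alpha(x) = \alpha(\frac{2}{4}) = 2$, so the element of
$\mathbb{Z}$ assigned to x is not uniquely determined.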

Two mappings are equal if and only if they have the same action. 


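Theorem 1. If $\alpha : A \to B$ and $\beta : A \to B$ are mappings, then

\[ \alpha = \beta \quad \text{if and only if} \quad \alpha(a) = \beta(a) \text{ for all } a \in A. \]

Proof. The formal definition presents $\alpha$ and $\beta$ as sets of ordered pairs:
$\alpha = \{(a, \alpha(a)) \mid a \in A\}$ and $\beta = \{(a, \beta(a)) \mid a \in A\}$. Now Theorem 1 follows
from the principle of set equality. ■


Example 3. Show that $\alpha = \beta$, where $\alpha : \mathbb{R} \to \mathbb{R}$ and $\beta : \mathbb{R} \to \mathbb{R}$ are given for all
$x \in \mathbb{R}$ by

\[ \alpha(x) = x^2 + x + 1 \quad \text{and} \quad \beta(x) = (x - 1)(x + 2) + 3. \]

Solution. The fact that $x^2 + x + 1 = (x - 1)(x + 2) + 3$ is an identity in x (that
is, it is true for all $x \in \mathbb{R}$) implies that $\alpha = \beta$. Such identities are the basis of many
of the manipulations of mappings defined by formulas. □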


One-to-One and Onto Mappings 


Let $\alpha : A \to B$ be a mapping. For convenience, let us say that an element $b \in B$
is “hit” by $\alpha$ if $b = \alpha(a)$ for some $a \in A$, that is, if b is the image of some a in A.
We say that $\alpha$ is one-to-one (or injective) if no element of B is “hit” more than
once, that is, if (for a and $a_1$ in A)

\[ \alpha(a) = \alpha(a_1) \quad \text{implies} \quad a = a_1. \]

We say that $\alpha$ is onto (or surjective) if every element of B is “hit” at least once,
that is,

\[ \text{every } b \in B \text{ has the form } b = \alpha(a) \text{ for some } a \in A. \]

A mapping that is both one-to-one and onto is called a bijection and is said to be
bijective.

These notions are best illustrated by arrow diagrams. Consider the mappings
$\alpha : \{1, 2, 3\} \to \{a, b, c, d\}$ and $\beta : \{1, 2, 3, 4\} \to \{a, b, c\}$ with the following actions:

[Arrow diagrams for $\alpha$ and $\beta$.]

Then $\alpha$ is one-to-one (no element is “hit” twice) but not onto (b is not “hit”),
whereas $\beta$ is onto (every element of {a, b, c} is “hit”) but not one-to-one (a is “hit”
twice).
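
For mappings with a finite domain, both properties can be tested mechanically. In the sketch below a mapping is stored as a Python dictionary sending each element of the domain to its image. The particular actions of `alpha` and `beta` are hypothetical, chosen only to agree with the description above (b is missed by the first mapping, and a is hit twice by the second); the helper names are likewise illustrative.

```python
def is_one_to_one(mapping: dict) -> bool:
    """True if no element of the codomain is hit more than once."""
    images = list(mapping.values())
    return len(images) == len(set(images))


def is_onto(mapping: dict, codomain: set) -> bool:
    """True if every element of the codomain is hit at least once."""
    return set(mapping.values()) == set(codomain)


# Hypothetical actions consistent with the text's description.
alpha = {1: "a", 2: "c", 3: "d"}         # into {a, b, c, d}; b is never hit
beta = {1: "a", 2: "a", 3: "b", 4: "c"}  # into {a, b, c}; a is hit twice

print(is_one_to_one(alpha), is_onto(alpha, {"a", "b", "c", "d"}))
print(is_one_to_one(beta), is_onto(beta, {"a", "b", "c"}))
```

A dictionary holds exactly one value per key, so a mapping stored this way automatically assigns a single image to each listed element; only the requirement that every element of the domain appears as a key has to be checked separately.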

Example 4. If $\alpha : \mathbb{N} \to \mathbb{N}$ is defined by $\alpha(n) = 2n + 1$ for all $n \in \mathbb{N}$, show that $\alpha$
is one-to-one but not onto.


Solution. If $\alpha(n) = \alpha(m)$, then $2n + 1 = 2m + 1$, whence n = m. This shows that $\alpha$
is one-to-one. But $\alpha$ is not onto because no even integer has the form $\alpha(n) = 2n + 1$
for $n \in \mathbb{N}$. □


Example 5. Show that $\alpha : \mathbb{R} \to \mathbb{R}$ given by $\alpha(x) = 2x - 5$ is a bijection.


Solution. If $\alpha(x) = \alpha(x_1)$, then $2x - 5 = 2x_1 - 5$. This implies that $x = x_1$, so
$\alpha$ is one-to-one. To show that $\alpha$ is onto, we must demonstrate that each element
$y \in \mathbb{R}$ (the codomain) has the form $y = \alpha(x)$ for some x in $\mathbb{R}$. This requirement is
$y = 2x - 5$, which has a solution $x = \frac{1}{2}(y + 5)$ in $\mathbb{R}$ for each y. □


If $\alpha : A \to B$ is a mapping, the image of $\alpha$ is the set

\[ \operatorname{im}(\alpha) = \alpha(A) = \{\alpha(a) \mid a \in A\} \]

of all images of elements of A. Thus, $\alpha(A) \subseteq B$, and $\alpha$ is onto if and only if $\alpha(A) = B$.
It is convenient sometimes to regard $\alpha : A \to \alpha(A)$. With this smaller codomain,
it is clear that $\alpha$ is onto.

If $\alpha : A \to B$ is a bijection, the correspondence $a \mapsto \alpha(a)$ pairs every element in
each of the sets A and B with exactly one element of the other set. In particular,
if both A and B are finite, they have the same number of elements. We write this
as |A| = |B|, where |X| denotes the number of elements in the finite set X.

We have presented examples of mappings that are onto and not one-to-one 
and mappings that are one-to-one and not onto. Theorem 2 covers an important 
situation in which these properties are equivalent. 


Theorem 2. Let $\alpha : A \to B$ be a mapping where A and B are nonempty finite
sets with |A| = |B|. Then $\alpha$ is one-to-one if and only if $\alpha$ is onto.


Proof. If $\alpha$ is one-to-one, then $\alpha : A \to \alpha(A)$ is a bijection, so $|A| = |\alpha(A)|$. Hence,
$|\alpha(A)| = |B|$ and it follows that $\alpha(A) = B$ because $\alpha(A) \subseteq B$ and both sets are
finite. This means that $\alpha$ is onto.

Conversely, assume that $\alpha$ is onto. Let |A| = |B| = n and write $B = \{b_1, b_2, \ldots, b_n\}$, where the $b_i$ are
distinct. Let $A_i = \{a \in A \mid \alpha(a) = b_i\}$ for each i. Then $A = A_1 \cup A_2 \cup \cdots \cup A_n$,
and $A_i \cap A_j = \emptyset$ whenever $i \neq j$ because the $b_i$ are distinct. It follows that
$n = |A| = |A_1| + |A_2| + \cdots + |A_n|$. But $|A_i| \geq 1$ for each i (because $\alpha$ is onto), so
$|A_i| = 1$ for each i. This implies that $\alpha$ is one-to-one. ■
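
For very small sets, Theorem 2 can be confirmed by exhaustion. The sketch below enumerates every mapping from {0, 1, ..., n - 1} to itself (there are n^n of them) and checks that the one-to-one mappings are exactly the onto mappings. The function name and the choice of sizes 1 through 5 are illustrative assumptions; the check supplements the proof and does not replace it.

```python
from itertools import product


def one_to_one_iff_onto(n: int) -> bool:
    """Check Theorem 2 for every mapping from {0, ..., n-1} to itself."""
    codomain = set(range(n))
    # Each tuple `images` describes one mapping: images[i] is the image of i.
    for images in product(range(n), repeat=n):
        one_to_one = len(set(images)) == len(images)
        onto = set(images) == codomain
        if one_to_one != onto:
            return False
    return True


print(all(one_to_one_iff_onto(n) for n in range(1, 6)))
```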


Composition and Inverse 


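Two linked mappings $A \xrightarrow{\alpha} B \xrightarrow{\beta} C$ may be combined naturally to obtain a mapping
$A \to C$. In this case, we define the composite mapping

\[ \beta\alpha : A \to C \quad \text{by} \quad \beta\alpha(a) = \beta[\alpha(a)] \text{ for all } a \in A. \]

Thus, the action of the composite mapping $\beta\alpha$ is “first $\alpha$, then $\beta$” (see the diagram),
so the symbol $\beta\alpha$ must be read from right to left.⁵ Clearly, the
composite $\alpha\beta$ cannot be formed unless $\beta(B) \subseteq A$. But even if $\alpha\beta$ and $\beta\alpha$ can both
be defined, they need not be equal.


⁵Many authors write $\beta \circ \alpha$ for the composite mapping, but we use the simpler notation $\beta\alpha$.


Example 6. Let $\alpha : \mathbb{R} \to \mathbb{R}$ and $\beta : \mathbb{R} \to \mathbb{R}$ be defined by $\alpha(x) = x + 1$ and
$\beta(x) = x^2$ for all $x \in \mathbb{R}$. Find the action of $\beta\alpha$ and $\alpha\beta$ and conclude that $\alpha\beta \neq \beta\alpha$.


Solution. If $x \in \mathbb{R}$, then $\beta\alpha(x) = \beta[\alpha(x)] = \beta(x + 1) = (x + 1)^2$, whereas
$\alpha\beta(x) = \alpha[\beta(x)] = \alpha(x^2) = x^2 + 1$. Clearly, $x \in \mathbb{R}$ exists with $\alpha\beta(x) \neq \beta\alpha(x)$,
so $\alpha\beta \neq \beta\alpha$ by Theorem 1. □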
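
The computation in Example 6 can be replayed in Python. The sketch assumes the formulas used in the solution, namely that the first mapping sends x to x + 1 and the second sends x to x squared; the helper `compose` is an illustrative name. Its argument order follows the right-to-left convention of the text: `compose(outer, inner)` applies `inner` first.

```python
def compose(outer, inner):
    """Return the composite mapping x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def alpha(x):
    return x + 1


def beta(x):
    return x * x


beta_alpha = compose(beta, alpha)  # first alpha, then beta: (x + 1)^2
alpha_beta = compose(alpha, beta)  # first beta, then alpha: x^2 + 1

print(beta_alpha(1), alpha_beta(1))  # 4 2
```

A single input on which the two composites disagree, such as x = 1, is enough to conclude that they are different mappings.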


For a set A, the identity map $1_A : A \to A$ is defined by

\[ 1_A(a) = a \quad \text{for all } a \in A. \]

This mapping plays an important role; the notation $1_A$ is explained in part (1) of
Theorem 3.


Theorem 3. Let $A \xrightarrow{\alpha} B \xrightarrow{\beta} C \xrightarrow{\gamma} D$ be mappings. Then
(1) $\alpha 1_A = \alpha$ and $1_B \alpha = \alpha$.
(2) $\gamma(\beta\alpha) = (\gamma\beta)\alpha$.
(3) If $\alpha$ and $\beta$ are both one-to-one (both onto), the same is true of $\beta\alpha$.


Proof. (1) If $a \in A$, then $\alpha 1_A(a) = \alpha[1_A(a)] = \alpha(a)$. Thus, $\alpha 1_A$ and $\alpha$ have the same
action, that is, $\alpha 1_A = \alpha$. Similarly, $1_B \alpha = \alpha$.

(2) If $a \in A$: $[\gamma(\beta\alpha)](a) = \gamma[\beta\alpha(a)] = \gamma[\beta(\alpha(a))] = \gamma\beta[\alpha(a)] = [(\gamma\beta)\alpha](a)$.

(3) If $\alpha$ and $\beta$ are one-to-one, suppose that $\beta\alpha(a) = \beta\alpha(a_1)$, where $a, a_1 \in A$.
Thus, $\beta[\alpha(a)] = \beta[\alpha(a_1)]$, so $\alpha(a) = \alpha(a_1)$ because $\beta$ is one-to-one. But then $a = a_1$
because $\alpha$ is one-to-one. This shows that $\beta\alpha$ is one-to-one.

Now assume that $\alpha$ and $\beta$ are both onto. If $c \in C$, we have $c = \beta(b)$ for some
$b \in B$ (because $\beta$ is onto) and then $b = \alpha(a)$ for some $a \in A$ (because $\alpha$ is onto).
Hence, $c = \beta[\alpha(a)] = \beta\alpha(a)$, proving that $\beta\alpha$ is onto. ■


We say that composition is associative because of the property $\gamma(\beta\alpha) = (\gamma\beta)\alpha$
in (2), and the composite is denoted simply as $\gamma\beta\alpha$. Note that the action of this
mapping is

\[ \gamma\beta\alpha(a) = \gamma[\beta[\alpha(a)]] \]

and so can be described as “first $\alpha$, then $\beta$, then $\gamma$” (see the proof of (2)).


Sometimes the action of one mapping reverses the action of another. For example,
consider $\alpha : \mathbb{R} \to \mathbb{R}$ and $\beta : \mathbb{R} \to \mathbb{R}$ defined by

\[ \alpha(x) = 2x \quad \text{and} \quad \beta(x) = \tfrac{1}{2}x \quad \text{for all } x \in \mathbb{R}. \]

Then $\beta\alpha(x) = \beta[\alpha(x)] = \beta(2x) = \frac{1}{2}(2x) = x$ for all x; that is, $\beta\alpha = 1_{\mathbb{R}}$. Hence, $\beta$
undoes the action of $\alpha$. Similarly, $\alpha\beta = 1_{\mathbb{R}}$. In this case, we say that $\alpha$ and $\beta$ are
inverses of each other.

In general, if $\alpha : A \to B$ is a mapping, a mapping $\beta : B \to A$ is called an
inverse of $\alpha$ if

\[ \beta\alpha = 1_A \quad \text{and} \quad \alpha\beta = 1_B. \]

Clearly, if $\beta$ is an inverse of $\alpha$, then automatically $\alpha$ is an inverse of $\beta$. As we show in
Example 8, some mappings have no inverse. However, if $\beta$ and $\beta_1$ are two inverses
of $\alpha$, we have $\beta_1\alpha = 1_A$ and $\alpha\beta = 1_B$. Hence,

\[ \beta_1 = \beta_1 1_B = \beta_1(\alpha\beta) = (\beta_1\alpha)\beta = 1_A\beta = \beta \]

by Theorem 3, which proves Theorem 4.


Theorem 4. If $\alpha : A \to B$ has an inverse, the inverse mapping is unique.


A mapping $\alpha : A \to B$ that has an inverse is called an invertible mapping,
and the inverse mapping is denoted $\alpha^{-1}$. In this case, $\alpha^{-1} : B \to A$ is the unique
mapping satisfying

\[ \alpha^{-1}\alpha = 1_A \quad \text{and} \quad \alpha\alpha^{-1} = 1_B. \]

We can state these conditions as follows:

\[ \alpha^{-1}[\alpha(a)] = a \text{ for all } a \in A \quad \text{and} \quad \alpha[\alpha^{-1}(b)] = b \text{ for all } b \in B. \]

These are the Fundamental Identities relating $\alpha$ and $\alpha^{-1}$, and they show that
the action of each of $\alpha$ and $\alpha^{-1}$ undoes the action of the other.

If we have $\alpha : A \to B$ and can somehow come up with a mapping $\beta : B \to A$
such that $\beta\alpha = 1_A$ and $\alpha\beta = 1_B$, then $\alpha$ is invertible and $\beta = \alpha^{-1}$. Here is an
illustration.


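Example 7. If A = {1, 2, 3}, define $\alpha : A \to A$ by $\alpha(1) = 2$, $\alpha(2) = 3$, and
$\alpha(3) = 1$. Compute $\alpha^2 = \alpha\alpha$ and $\alpha^3 = \alpha\alpha\alpha$ and so find $\alpha^{-1}$.


Solution. We have $\alpha^2(1) = 3$, $\alpha^2(2) = 1$, and $\alpha^2(3) = 2$, as the reader can verify,
and so $\alpha^3(1) = 1$, $\alpha^3(2) = 2$, and $\alpha^3(3) = 3$. Thus, $\alpha^3 = 1_A$ and so $\alpha^2\alpha = 1_A = \alpha\alpha^2$.
Hence, $\alpha$ is invertible and $\alpha^2$ is the inverse; in symbols $\alpha^{-1} = \alpha^2$. □
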
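
The powers in Example 7 are easy to compute when the mapping is stored as a dictionary. In this sketch the helper `compose` and the variable names are illustrative; `compose(outer, inner)` applies `inner` first, matching the right-to-left convention for composites.

```python
alpha = {1: 2, 2: 3, 3: 1}


def compose(outer: dict, inner: dict) -> dict:
    """Return the composite: a -> outer[inner[a]] for each a in the domain."""
    return {a: outer[inner[a]] for a in inner}


alpha2 = compose(alpha, alpha)         # alpha applied twice
alpha3 = compose(alpha, alpha2)        # alpha applied three times
identity = {a: a for a in alpha}

print(alpha2)                          # {1: 3, 2: 1, 3: 2}
print(alpha3 == identity)              # True
print(compose(alpha2, alpha) == identity, compose(alpha, alpha2) == identity)
```

Both composites of `alpha` with `alpha2` give the identity, which is exactly the pair of conditions required of an inverse.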
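
Theorem 5. Let $\alpha : A \to B$ and $\beta : B \to C$ denote mappings.
(1) $1_A : A \to A$ is invertible and $1_A^{-1} = 1_A$.
(2) If $\alpha$ is invertible, then $\alpha^{-1}$ is invertible and $(\alpha^{-1})^{-1} = \alpha$.
(3) If $\alpha$ and $\beta$ are both invertible, then $\beta\alpha$ is invertible and $(\beta\alpha)^{-1} = \alpha^{-1}\beta^{-1}$.


Proof. (1) This result follows because $1_A 1_A = 1_A$.
(2) We have $\alpha^{-1}\alpha = 1_A$ and $\alpha\alpha^{-1} = 1_B$, so $\alpha$ is the inverse of $\alpha^{-1}$.
(3) Compute $(\beta\alpha)(\alpha^{-1}\beta^{-1}) = \beta[\alpha\alpha^{-1}]\beta^{-1} = \beta 1_B \beta^{-1} = \beta\beta^{-1} = 1_C$. A similar
calculation shows that $(\alpha^{-1}\beta^{-1})(\beta\alpha) = 1_A$, so $\alpha^{-1}\beta^{-1}$ is the inverse of $\beta\alpha$. Note
the order of the factors. ■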


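Example 8. Define $\alpha$ and $\beta : \mathbb{N} \to \mathbb{N}$ by $\alpha(n) = n + 1$ for all $n \in \mathbb{N}$, and

$$\beta(n) = \begin{cases} 1, & \text{if } n = 0, \\ n - 1, & \text{if } n > 0. \end{cases}$$

Show that $\beta\alpha = 1_{\mathbb{N}}$ but that $\alpha\beta \neq 1_{\mathbb{N}}$. Conclude that $\alpha$ is not invertible.


Solution. We have $\beta\alpha(n) = \beta(n + 1) = (n + 1) - 1 = n$ for all $n \in \mathbb{N}$, so $\beta\alpha = 1_{\mathbb{N}}$. However, $\alpha\beta \neq 1_{\mathbb{N}}$ because, for example, $\alpha\beta(0) = \alpha(1) = 2$. Note that $0 \notin \alpha(\mathbb{N})$, so $\alpha$ is not onto. Hence, $\alpha$ is not invertible by Theorem 6. □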
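
A short Python sketch makes the one-sidedness in Example 8 concrete. It only tests a finite sample of $\mathbb{N}$ (the bound 50 is an arbitrary assumption), so it illustrates the claim rather than proving it.

```python
def alpha(n):
    return n + 1

def beta(n):
    # beta(0) = 1 as in the example; beta(n) = n - 1 for n > 0
    return 1 if n == 0 else n - 1

sample = range(50)  # finite sample of N (an assumption for the check)

# beta∘alpha agrees with the identity on the sample
left_ok = all(beta(alpha(n)) == n for n in sample)

# alpha∘beta fails to be the identity at exactly these sample points
mismatches = [n for n in sample if alpha(beta(n)) != n]

print(left_ok)     # True
print(mismatches)  # [0]
```

The single mismatch at 0 reflects the fact that 0 is not in the image of `alpha`.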


Theorem 6. Invertibility Theorem. A mapping $\alpha : A \to B$ is invertible if and only if it is both one-to-one and onto (that is, $\alpha$ is a bijection).


Proof. Assume $\alpha^{-1}$ exists. If $\alpha(a) = \alpha(a_1)$, then $a = \alpha^{-1}[\alpha(a)] = \alpha^{-1}[\alpha(a_1)] = a_1$ by one of the fundamental identities. Hence, $\alpha$ is one-to-one. If $b \in B$, then $b = \alpha[\alpha^{-1}(b)]$ by the other fundamental identity, and so $\alpha$ is onto.

Conversely, assume $\alpha$ is onto and one-to-one. Given $b \in B$, there exists $a \in A$ such that $\alpha(a) = b$ (because $\alpha$ is onto) and $a$ is unique (because $\alpha$ is one-to-one, verify). Hence, we may define $\beta : B \to A$ by $\beta(b) = a$, where $a$ is the unique element of $A$ with $\alpha(a) = b$. Thus, $\alpha\beta(b) = \alpha(a) = b$ for each $b \in B$, so $\alpha\beta = 1_B$. If $a \in A$, write $\alpha(a) = b$. Hence, $\beta(b) = a$ by the definition of $\beta$, so $\beta\alpha(a) = \beta(b) = a$. This means that $\beta\alpha = 1_A$, so $\beta$ is the inverse of $\alpha$. ∎
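
For finite sets, the construction of $\beta$ in the proof can be carried out directly. The sketch below is an illustration under our own assumptions: mappings are Python dictionaries, and the function name `inverse` is hypothetical. It returns `None` when the mapping fails to be one-to-one or onto.

```python
def inverse(alpha, codomain):
    """Return the inverse of a finite mapping (a dict), or None if the
    mapping is not a bijection onto `codomain`."""
    beta = {}
    for a, b in alpha.items():
        if b in beta:            # two elements share the image b: not one-to-one
            return None
        beta[b] = a              # beta(b) is the unique a with alpha(a) = b
    if set(beta) != set(codomain):   # some element of the codomain is missed: not onto
        return None
    return beta

print(inverse({1: "x", 2: "y", 3: "z"}, "xyz"))  # {'x': 1, 'y': 2, 'z': 3}
print(inverse({1: "x", 2: "x", 3: "z"}, "xyz"))  # None
```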


Theorem 6 is important because it can show that a mapping is invertible even 
though no simple formula for the inverse is known. For example, we can show (using 
calculus) that the function a: R— R given by a(x) = 2° + 2z is one-to-one and 
onto. But a simple formula for a~! is not easy to write. 


Exercises 0.3 


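1. In each case, determine whether $\alpha$ is a well-defined mapping. Justify your answer.
(a) $\alpha : \mathbb{N} \to \mathbb{N}$ defined by $\alpha(n) = -n$ for all $n \in \mathbb{N}$.
(b) $\alpha : \mathbb{N} \to \mathbb{N}$ defined by $\alpha(n) = 1$ for all $n \in \mathbb{N}$.
(c) $\alpha : \mathbb{R} \to \mathbb{R}$ defined by $\alpha(x) = \sqrt{x}$ for all $x \in \mathbb{R}$.
(d) $\alpha : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ defined by $\alpha(x, y) = x + y$ for all $(x, y) \in \mathbb{R} \times \mathbb{R}$.
(e) $\alpha : \mathbb{R} \to \mathbb{R} \times \mathbb{R}$ defined by $\alpha(xy) = (x, y)$ for all $x, y \in \mathbb{R}$.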
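(f) $\alpha : \{1, 2, 3\} \to \{a, b, c\}$ defined by the diagram [arrow diagram from $\{1, 2, 3\}$ to $\{a, b, c\}$; arrows not recoverable].

(g) $\alpha : \{1, 2, 3\} \to \{a, b, c\}$ defined by the diagram [arrow diagram from $\{1, 2, 3\}$ to $\{a, b, c\}$; arrows not recoverable].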


2. In each case, state whether the mapping is onto, one-to-one, or bijective. Justify your answer.
(a) $\alpha : \mathbb{R} \to \mathbb{R}$ defined by $\alpha(x) = 3 - 4x$.
(b) $\alpha : \mathbb{R} \to \mathbb{R}$ defined by $\alpha(x) = 1 + x^2$.
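(c) $\alpha : \mathbb{N} \to \mathbb{N}$ defined by
$$\alpha(n) = \begin{cases} \frac{n}{2} & \text{if } n \text{ is even,} \\ \frac{n+1}{2} & \text{if } n \text{ is odd.} \end{cases}$$
(d) $\alpha : \mathbb{Z} \times \mathbb{Z}^+ \to \mathbb{Q}$ defined by $\alpha(n, m) = \frac{n}{m}$.
(e) $\alpha : \mathbb{R} \to \mathbb{R} \times \mathbb{R}$ defined by $\alpha(x) = (x + 1, x - 1)$.
(f) $\alpha : A \times B \to A$ defined by $\alpha(a, b) = a$. (Assume that $A \neq \emptyset \neq B$.)
(g) $\alpha : A \to A \times B$ defined by $\alpha(a) = (a, b_0)$, where $b_0 \in B$ is fixed and $A \neq \emptyset$.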


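3. Let $A \xrightarrow{\alpha} B \xrightarrow{\beta} C$ be mappings.

(a) If $\beta\alpha$ is onto, show that $\beta$ is onto.

(b) If $\beta\alpha$ is one-to-one, show that $\alpha$ is one-to-one.

(c) If $\beta\alpha$ is one-to-one and $\alpha$ is onto, show that $\beta$ is one-to-one.

(d) If $\beta\alpha$ is onto and $\beta$ is one-to-one, show that $\alpha$ is onto.

(e) If $\beta_1 : B \to C$ satisfies $\beta\alpha = \beta_1\alpha$ and $\alpha$ is onto, show that $\beta = \beta_1$.

(f) If $\alpha_1 : A \to B$ satisfies $\beta\alpha = \beta\alpha_1$ and $\beta$ is one-to-one, show that $\alpha = \alpha_1$.


4. For $\alpha : A \to A$, show that $\alpha^2 = 1_A$ if and only if $\alpha$ is invertible and $\alpha^{-1} = \alpha$.
5. (a) For $A \xrightarrow{\alpha} A$, show that $\alpha^2 = \alpha$ if and only if $\alpha(x) = x$ for all $x \in \alpha(A)$.

(b) If $A \xrightarrow{\alpha} A$ satisfies $\alpha^2 = \alpha$, show that $\alpha$ is onto if and only if $\alpha$ is one-to-one. Describe $\alpha$ in this case.
(c) Let $A \xrightarrow{\beta} B \xrightarrow{\gamma} A$ satisfy $\gamma\beta = 1_A$. If $\alpha = \beta\gamma$, show that $\alpha^2 = \alpha$.


6. If $|A| \geq 2$ and $\alpha : A \to A$ satisfies $\alpha\beta = \beta\alpha$ for all $\beta : A \to A$, prove that $\alpha = 1_A$.
7. In each case, verify that $\alpha^{-1}$ exists and describe its action.


(a) $\alpha : \mathbb{R} \to \mathbb{R}$ defined by $\alpha(x) = ax + b$, where $0 \neq a \in \mathbb{R}$ and $b \in \mathbb{R}$.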
(b) a: R—- {x €R|z > 1} defined by a(x) = 1427. 


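(c) $\alpha : \mathbb{N} \to \mathbb{N}$ defined by
$$\alpha(n) = \begin{cases} n + 1, & \text{if } n \text{ is even,} \\ n - 1, & \text{if } n \text{ is odd.} \end{cases}$$

(d) $\alpha : A \times B \to B \times A$ defined by $\alpha(a, b) = (b, a)$.


8. Let $A \xrightarrow{\alpha} B \xrightarrow{\beta} A$ satisfy $\beta\alpha = 1_A$. If either $\alpha$ is onto or $\beta$ is one-to-one, show that each of them is invertible and that each of them is the inverse of the other.

9. Let $A \xrightarrow{\alpha} B \xrightarrow{\beta} A$ satisfy $\beta\alpha = 1_A$. If $A$ and $B$ are finite sets with $|A| = |B|$, show that $\alpha\beta = 1_B$, $\alpha = \beta^{-1}$, and $\beta = \alpha^{-1}$. (Compare your answer with the solution of Example 8.)

10. For $A \xrightarrow{\alpha} B \xrightarrow{\beta} A$, show that both $\alpha\beta$ and $\beta\alpha$ have inverses if and only if both $\alpha$ and $\beta$ have inverses.

11. Let $M$ denote the set of all mappings $\alpha : \{1, 2\} \to B$. Define $\varphi : M \to B \times B$ by $\varphi(\alpha) = (\alpha(1), \alpha(2))$. Show that $\varphi$ is a bijection and find the action of $\varphi^{-1}$.

12. A mapping $\delta : A \to B$ is called a constant map if there exists $b_0 \in B$ such that $\delta(a) = b_0$ for all $a \in A$. Show that a mapping $\delta : A \to B$ is constant if and only if $\delta\alpha = \delta$ for all $\alpha : A \to A$.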

13. If $|A| = n$ and $|B| = m$, show that there are $m^n$ mappings $A \to B$.

14. Show that the following conditions are equivalent for a mapping $\alpha : A \to B$, where $A$ and $B$ are nonempty.

(a) $\alpha$ is one-to-one.

(b) There exists $\beta : B \to A$ such that $\beta\alpha = 1_A$.

(c) If $\gamma : C \to A$ and $\delta : C \to A$ satisfy $\alpha\gamma = \alpha\delta$, then $\gamma = \delta$.




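15. Show that the following conditions are equivalent for a mapping $\alpha : A \to B$, where $A$ and $B$ are nonempty.
(a) $\alpha$ is onto.
(b) There exists $\beta : B \to A$ such that $\alpha\beta = 1_B$.
(c) If $\gamma : B \to C$ and $\delta : B \to C$ satisfy $\gamma\alpha = \delta\alpha$, then $\gamma = \delta$.

16. If $A \neq \emptyset$ and $P(A) = \{X \mid X \subseteq A\}$, show there is no onto mapping $\alpha : A \to P(A)$.
[Hint. Let $R = \{r \in A \mid r \notin \alpha(r)\}$. If $R = \alpha(a)$, is $a \in R$ or $a \notin R$?]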


0.4 EQUIVALENCES 


It often happens that elements of a set are alike in some respect, but they are not 
necessarily equal. For example, similar triangles are alike in that they have the 
same angles, but they need not be equal in size. For another example, two subsets 
of a finite set may be regarded as alike if they have the same number of elements. 
The concept of an equivalence relation unifies such examples in a useful way. 

If $A$ is a set, a subset $\equiv$ of $A \times A$ is called a relation on $A$. For elements $a$ and $b$ in $A$, we customarily write

$$a \equiv b \quad\text{to mean}\quad (a, b) \text{ is an element of the set } \equiv$$

and we write $a \not\equiv b$ when $(a, b)$ is not in $\equiv$.
A relation $\equiv$ on a set $A$ is called an equivalence on $A$ if it satisfies the following conditions, where $a$, $b$, and $c$ denote elements of $A$:


(1) $a \equiv a$ for all $a \in A$ (reflexive property),
(2) If $a \equiv b$, then $b \equiv a$ (symmetric property),
(3) If $a \equiv b$ and $b \equiv c$, then $a \equiv c$ (transitive property).


If $\equiv$ is an equivalence on a set $A$, the statement $a \equiv b$ is read as “$a$ is equivalent to $b$.” Certainly, equality is an example of an equivalence, and the notation $\equiv$ reflects the idea that an equivalence relation is a weakened form of equality. Intuitively, $a \equiv b$ holds when $a$ and $b$ are alike in some sense. Thus, given an element $a$ of $A$, the set of all elements equivalent to $a$ plays a central role in revealing the structure of the equivalence relation.

More formally, let $\equiv$ be an equivalence on a set $A$. Given $a \in A$, the equivalence class $[a]$ of $a$ is defined as the set of all elements of $A$ that are equivalent to $a$:

$$[a] = \{x \in A \mid x \equiv a\}.$$

The equivalence class $[a]$ is said to be generated by $a$.
Examples 1-5 illustrate equivalences. In most cases, we leave verification of the 
three defining properties to the reader. 


Example 1. Equality is an equivalence on any set $A$. If $a \in A$, the equivalence class of $a$ is $[a] = \{x \in A \mid x = a\} = \{a\}$, the singleton.


Example 2. Being parallel is an equivalence on the set of lines in the plane. The 
equivalence class of a given line consists of all lines parallel to it. 




Example 3. If $X$ and $Y$ are subsets of a finite set $U$, write $X \equiv Y$ to mean $|X| = |Y|$. Then $\equiv$ is an equivalence on the set of subsets of $U$, and $[X]$ consists of all subsets with the same number of elements as $X$.


Example 4. Let $\alpha : A \to B$ be a mapping. If $a$ and $a_1$ are elements of $A$, write $a \equiv a_1$ to mean $\alpha(a) = \alpha(a_1)$. Then $\equiv$ is an equivalence on $A$, called the kernel equivalence of $\alpha$, and $[a] = \{x \in A \mid \alpha(x) = \alpha(a)\}$ for each $a \in A$.
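
On a finite set, the classes of a kernel equivalence are found by grouping elements according to their image. The following Python sketch illustrates this; the function name `kernel_classes` and the sample mapping $x \mapsto x^2$ are our own assumptions, not taken from the text.

```python
from collections import defaultdict

def kernel_classes(alpha, A):
    """Group the elements of the finite collection A by their image under alpha."""
    classes = defaultdict(list)
    for a in A:
        classes[alpha(a)].append(a)   # a and a1 share a cell iff alpha(a) == alpha(a1)
    return list(classes.values())

print(kernel_classes(lambda x: x * x, [-2, -1, 0, 1, 2]))
# [[-2, 2], [-1, 1], [0]]
```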


Example 5. If $m$ and $n$ are integers, define $m \equiv n$ to mean that $m - n$ is even. Then $\equiv$ is an equivalence on $\mathbb{Z}$. (Proof of transitivity: If $m \equiv n$ and $n \equiv k$, then both $m - n$ and $n - k$ are even, so $m - k = (m - n) + (n - k)$ is also even. Thus, $m \equiv k$.) In this case,


$[0] = \{x \in \mathbb{Z} \mid x \equiv 0\}$ is the set of even integers, and
$[1] = \{x \in \mathbb{Z} \mid x \equiv 1\}$ is the set of odd integers.


Moreover, it is not difficult to verify that $[m] = [0]$ if $m$ is even and $[m] = [1]$ if $m$ is odd, and so $[0]$ and $[1]$ are the only equivalence classes. □


We describe equivalences such as the one in Example 5 in more detail in Section 1.3.
Theorem 1 collects the basic properties of equivalence classes. 


Theorem 1. Let $\equiv$ be an equivalence on a set $A$ and let $a$ and $b$ denote elements of $A$. Then
(1) $a \in [a]$ for every $a \in A$.
(2) $[a] = [b]$ if and only if $a \equiv b$.
(3) If $a \in [b]$ then $[a] = [b]$.
(4) If $[a] \neq [b]$ then $[a] \cap [b] = \emptyset$.


Proof. (1) This is clear because $a \equiv a$ for all $a \in A$ by the reflexive property.

(2) If $[a] = [b]$, then $a \in [b]$ by (1), so $a \equiv b$. Conversely, assume $a \equiv b$. If $x \in [a]$, then $x \equiv a$, so, since $a \equiv b$, we have $x \equiv b$ by transitivity. Thus, $x \in [b]$ and we have proved that $[a] \subseteq [b]$. The other inclusion $[b] \subseteq [a]$ follows in the same way because $b \equiv a$ by symmetry. Hence $[a] = [b]$.

(3) If $a \in [b]$, then $a \equiv b$, so $[a] = [b]$ by (2).

(4) We argue by contradiction. If $[a] \neq [b]$, we assume on the contrary that $[a] \cap [b] \neq \emptyset$, say $x \in [a] \cap [b]$. Then $x \equiv a$ and $x \equiv b$, so $a \equiv b$ by the symmetric and transitive properties. But then $[a] = [b]$ by (2), a contradiction. ∎


The view that an equivalence is a weakened version of equality is upheld by (2) 
of Theorem 1. However, the equality is for equivalence classes rather than elements. 
Property (2) is used several times in this book. 


Partitions 


Theorem 1 leads to a useful description of equivalence relations. Two sets $X$ and $Y$ are called disjoint if they have no element in common (that is, $X \cap Y = \emptyset$), and a family of sets is called pairwise disjoint if any two (distinct) sets in the family are disjoint.

If $A$ is a nonempty set, a family $P$ of subsets of $A$ is called a partition of $A$ (and the sets in $P$ are called the cells of the partition) if

(1) No cell is empty.
(2) The cells are pairwise disjoint.
(3) Every element of $A$ belongs to some cell.


If $P$ is a partition of $A$, (2) and (3) clearly imply that each element of $A$ lies in exactly one cell of $P$.

The simplest partition of $A$ is the trivial partition $P = \{A\}$ with just one cell: $A$ itself. At the other extreme is the singleton partition $P = \{\{a\} \mid a \in A\}$, where every cell is a singleton.


Example 6. The set $A = \{1, 2, 3\}$ has five partitions:


$$\{A\} \qquad \{\{1,2\},\{3\}\} \qquad \{\{1,3\},\{2\}\} \qquad \{\{2,3\},\{1\}\} \qquad \{\{1\},\{2\},\{3\}\}$$
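
The count in Example 6 can be confirmed by listing the partitions mechanically. The recursive generator below is one possible sketch (the name `partitions` and the list-of-lists representation are our assumptions): each partition of the remaining elements is extended by placing the first element either into an existing cell or into a new cell of its own.

```python
def partitions(elements):
    """Yield every partition of the list `elements` as a list of cells."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in partitions(rest):
        # put `first` into each existing cell in turn ...
        for i, cell in enumerate(smaller):
            yield smaller[:i] + [[first] + cell] + smaller[i + 1:]
        # ... or into a new cell of its own
        yield [[first]] + smaller

all_parts = list(partitions([1, 2, 3]))
for p in all_parts:
    print(p)
print(len(all_parts))  # 5
```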


Partitions of a set $A$ give rise to equivalences on $A$ in a natural way. If $P$ is a partition of the nonempty set $A$, and if $a$ and $b$ are elements of $A$, we define $a \equiv b$ to mean that $a$ and $b$ are in the same cell of $P$. Then $\equiv$ is reflexive because each $a \in A$ lies in some cell, so $a \equiv a$. The relation $\equiv$ is obviously symmetric. To show that it is transitive, we let $a \equiv b$ and $b \equiv c$. Because $b$ lies in a unique cell, $a$ and $c$ are in that same cell; that is, $a \equiv c$. Hence, $\equiv$ is an equivalence on $A$, and we say that it is the equivalence afforded by the partition $P$. Surprisingly, every equivalence on $A$ arises in this way.


Theorem 2. Partition Theorem. If $\equiv$ is any equivalence on a nonempty set $A$, the family of all equivalence classes is a partition of $A$ that affords $\equiv$.


Proof. The equivalence classes are nonempty and pairwise disjoint by (1) and (4) of Theorem 1, and every element of $A$ belongs to some class (the one it generates). Hence, the equivalence classes are the cells of a partition. To show that this partition affords $\equiv$, it is enough to show that two elements $a$ and $b$ are equivalent if and only if they belong to the same equivalence class. If $a \equiv b$, then $[a] = [b]$ by (2) of Theorem 1, so $a$ and $b$ belong to this common class. Conversely, if $a$ and $b$ belong to class $[c]$, then $[a] = [c] = [b]$ by (3) of Theorem 1, so $a \equiv b$. ∎
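
The Partition Theorem can be observed on a finite example. The Python sketch below (function name `equivalence_classes` and the sample range are our assumptions) groups a finite set into classes of a relation that is assumed to be an equivalence, comparing each element with one generator per class, which is legitimate by (2) and (3) of Theorem 1.

```python
def equivalence_classes(A, equiv):
    """Group the finite collection A into the classes of `equiv`
    (assumed to be an equivalence relation)."""
    classes = []
    for a in A:
        for cell in classes:
            if equiv(a, cell[0]):   # compare with one generator of the cell
                cell.append(a)
                break
        else:
            classes.append([a])     # a generates a new class
    return classes

A = list(range(-3, 4))
same_parity = lambda m, n: (m - n) % 2 == 0   # the equivalence of Example 5
classes = equivalence_classes(A, same_parity)
print(classes)  # [[-3, -1, 1, 3], [-2, 0, 2]]
```

The resulting cells are nonempty, pairwise disjoint, and together contain every element of `A`, as the theorem predicts.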


Theorem 2 shows that partitions of A and equivalences on A are actually two 
ways of looking at the same phenomenon—classifying the elements of A. On the 
one hand, we classify them by declaring which pairs of elements are equivalent; on 
the other hand, we classify them by partitioning A into disjoint cells. 

For example, equality on a set $A$ is the equivalence afforded by the singleton partition of $A$. At the other extreme, the trivial partition $\{A\}$ affords the equivalence that declares that any two elements of $A$ are equivalent.

If $\equiv$ is an equivalence on $A$, the set of all equivalence classes is called the quotient set and denoted $A_\equiv$. Hence,

$$A_\equiv = \{[a] \mid a \in A\}.$$
The mapping
$$\varphi : A \to A_\equiv \quad\text{given by}\quad \varphi(a) = [a] \ \text{ for all } a \in A$$

is called the natural mapping. The natural mapping $\varphi$ is clearly onto and (2) of Theorem 1 shows that $\varphi(a) = \varphi(a_1)$ if and only if $a \equiv a_1$. In other words, $\equiv$ is the kernel equivalence of $\varphi$ (see Example 4). This proves the following consequence of the partition theorem.


Corollary. Every equivalence on a set $A$ is the kernel equivalence of some onto mapping with $A$ as domain.


The fact that the same equivalence class can have different generators leads to a minor difficulty when we are defining a mapping whose domain is a quotient set. This problem usually arises in the following way. Suppose that $\equiv$ is an equivalence on a set $A$ and that a mapping

$$\alpha : A \to B$$
is given. If we write $A_\equiv = \{[a] \mid a \in A\}$ as before, we are often interested in defining
$$\sigma : A_\equiv \to B \quad\text{by}\quad \sigma([a]) = \alpha(a)$$

for each equivalence class $[a]$ in $A_\equiv$. The question is whether $\sigma$ is a mapping. The problem is that a given equivalence class $C$ could be generated by distinct elements of $A$:

$$C = [a] = [a_1],$$

where $a \neq a_1$. Then $\sigma(C)$ will be $\alpha(a)$ or $\alpha(a_1)$, depending on whether we use $C = [a]$ or $C = [a_1]$. Clearly, if the action of $\sigma$ is to make sense,

$$[a] = [a_1] \quad\text{must imply that}\quad \alpha(a) = \alpha(a_1).$$

Then the assignment $\sigma([a]) = \alpha(a)$ does not depend on which element $a$ generates the equivalence class. We express this conclusion by saying that $\sigma$ is well defined by this formula.


Example 7. Let $\equiv$ be the equivalence on $\mathbb{Z}$ defined by $m \equiv n$ if $m - n$ is even (Example 5). Show that the mapping $\sigma : \mathbb{Z}_\equiv \to \{1, -1\}$ is well defined by $\sigma([n]) = (-1)^n$. Then show that $\sigma$ is a bijection.


Solution. To show that $\sigma$ is well defined, we must show that $[m] = [n]$ implies $(-1)^m = (-1)^n$. But $[m] = [n]$ implies $m \equiv n$ by (2) of Theorem 1, so $m - n$ is even. Hence, both $m$ and $n$ are even or both are odd, and $(-1)^m = (-1)^n$ follows. Thus, $\sigma$ is well defined. Verification that $\sigma$ is one-to-one is the converse of the argument that it is well defined: If $\sigma([m]) = \sigma([n])$, then $(-1)^m = (-1)^n$, so $m$ and $n$ are both even or both odd. Either way $m - n$ is even, so $m \equiv n$. This means that $[m] = [n]$ by Theorem 1, proving that $\sigma$ is one-to-one. As $\sigma$ is clearly onto, it is a bijection. Note that this result shows that $|\mathbb{Z}_\equiv| = 2$, a fact confirmed in a different way in Example 5. □
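
Well-definedness is a statement about representatives, so it can be spot-checked by comparing the value computed from different generators of the same class. The Python sketch below does this on a finite range of integers (the range and the names `sigma` and `tau` are assumptions); it also shows a formula that is not well defined on classes.

```python
def sigma(n):
    """The value (-1)^n, computed from a representative n of the class [n]."""
    return 1 if n % 2 == 0 else -1

def tau(n):
    """A candidate formula [n] -> n, which depends on the representative."""
    return n

reps = range(-10, 11)
equivalent_pairs = [(m, n) for m in reps for n in reps if (m - n) % 2 == 0]

# sigma gives the same value on every pair of equivalent representatives
sigma_ok = all(sigma(m) == sigma(n) for m, n in equivalent_pairs)
# tau does not, so it is not well defined on the quotient set
tau_ok = all(tau(m) == tau(n) for m, n in equivalent_pairs)

print(sigma_ok, tau_ok)  # True False
```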


Exercises 0.4 


1. In each case, decide whether the relation = is an equivalence on A. Give reasons for 
your answer. If it is an equivalence, describe the equivalence classes. 
(a) A= {-2,—1,0,1,2}; @=b means that a? —-a = b3 — b. 
(b) A = {—1,0,1}; a =} means that a? = b?. 
(c) A={zER|x>0}; c=y means that cy = 1. 




(d) A=N; a@=b means that a < b. 

(e) A=N; a =b means that b = ka for some integer k. 

(f) A = the set of all subsets of {1,2,3}; X =Y means that |X| = |Y]. 

(g) A = the set of lines in the plane; x = y means z is perpendicular to y. 

(h) A=Rx R; (2, y) = (v1, y1) means that x? + y? = 2? + y?. 

(i) A=R xR; (2, y) = (v1, y1) means that y — 32 = y, — 324. 

. Let U = {1,2,3} and A =U x U. In each case, show that = is an equivalence on A 
and find the quotient set Az. 

(a) (a, b) = (ay, 01) ifa+b= ay + by. 

(b) (a, b) = (a1, bi) if ab = ayb1. 

(c) (a, b) 4 (a1, b1) ifa= ai. 

(d) (a,b) = (a1,b1) ifa-—b=a,—by. 

. In each case, show that = is an equivalence on A and find a (well-defined) bijection 
o0:Az—-> B. 

(a) A= Z; m=n means that m? =n?; B=N. 

(b) A= Rx R; (z,y) = (@1,y1) means that 2? + y? =2?+y?; B={zeER|z> 0}. 
(c) A=RxR; (2, y) = (21, y1) means that y= yi; B=R. 

(d) A= Rt x Rt; (a, y) = (z1,y1) means that y/x = y1/21; B= {z ER| x > Of. 
(ec) A=R; «© =y means that r—y €Z; B={xeER|0<a2< 1}. 

(f) A= Z; m =n means that m? — n? is even; B = {0,1}. 

. Find all partitions of A = {1,2,3, 4}. 

. Let Py = {Cy,Co,-+- , Cm} and Pz = {Di, D2,:-+ , Dn} be partitions of a set A. 

(a) Show that P = {C;N D; | C; 1 D; # OD} is also a partition of A. 

(b) If =1, =e, and = denote the equivalences afforded by P,, P2, and P, respectively, 
describe = in terms of =; and =.. 

. Let = and ~ be two equivalences on the same set A. 

(a) If a =a, implies that a ~ a), show that each ~ equivalence class is partitioned 
by the = equivalence classes it contains. 

(b) Define & on A by writing a & a, if and only if both a = a, and a ~ ay. Show that 
& is an equivalence and describe the © equivalence classes in terms of the = 
and ~ equivalence classes. 

. In each case, determine whether a : Qt — Q is well defined, where Q* is the set of 
positive rational numbers. Support your answer. 


@a(m=n (ble) =teme ()a(k)=m+n (d) ofS) =e 


. Define = and~ on Rbyc=yife—yeZandbyzryife-yEeQ 

(a) Show that = and ~ are equivalences. 

(b) Show that a:R=-— Ry is well defined and onto if a([z]z) =(z].. Is the 
mapping @ one-to-one? 

. For a mapping a: A — B, let = denote the kernel A c B 
equivalence of a and let y: A— Az denote the 

natural mapping. Define 


o:Az—B by  o({al) = a(a) g 


for all equivalence classes [a] in Az. 
(a) Show that o is well defined and one-to-one, 
onto if a@ is onto. As 




10. 




(b) Show that a = oy, so that a is the composite of an onto mapping followed by 
a one-to-one mapping. 
(c) If a(A) is a finite set, show that the set Az of equivalence classes is also finite 
and that |A=| = |a(A)|. (This result is called the Bijection Theorem). 
(d) In each case, find |Az| for the given mapping a. 
(i) ASUxU_ with U={1,2,3,4,6,12}, a:4—7Q defined by 
a(n,m) = n/m. 
(ii) A= {ne Z|1<n< 99}, a: AN defined by a(n) = the sum of 
the digits of n. 
Let A={a|a:P—Q is a mapping}. Given pe P, define = on A by a= 6 if 
a(p) = B(p). 
(a) Show that = is an equivalence on A. 
(b) Find a mapping 4: A — Q such that = is the kernel equivalence of 4. 
(c) If }Q| = n, how many equivalence classes does = have? [Hint: Exercise 9.] 




Chapter 1 


Integers and Permutations 


God made the integers, and all the rest is the work of man. 


—Leopold Kronecker 


The use of arithmetic is a basic aspect of human culture. Anthropologists tell us 
that even the most primitive societies, because of their desire to count objects, have 
developed some sort of terminology for the numbers 1,2, and 3, although many go 
no further. As a culture develops, it needs more sophisticated counting to deal with 
commerce, warfare, the calendar, and so on. This leads to methods of recording 
numbers often (but by no means always) based on groups of 10, presumably from 
counting on the fingers. Then the recording of numbers by making marks or notches 
becomes important (in bookkeeping, for example), and a variety of systems have 
been constructed for doing so. Many of these systems were not very useful for adding 
or multiplying (try multiplying with Roman numerals), and the development of our 
positional system, originating with the Babylonians using base 60 rather than 10, 
was a great advance. 

In this chapter we assume the validity of the elementary arithmetic properties of 
the integers and use them to derive some more subtle facts related to divisibility and 
primes. Then two fundamental algebraic systems are described: the integers modulo 
n and the permutations of the set {1,2,...,n}. These are, respectively, excellent 
examples of rings and groups, two of the basic algebraic structures presented in 
detail in Chapters 2 and 3. 


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 




1.1 INDUCTION 


Great fleas have little fleas upon their backs to bite 'em, And little fleas have lesser fleas, 
and so ad infinitum. 


—Augustus De Morgan 


Consider the sequence of equations:

$$\begin{aligned} 1 &= 1 \\ 1 + 3 &= 4 \\ 1 + 3 + 5 &= 9 \\ 1 + 3 + 5 + 7 &= 16 \end{aligned}$$

It is clear there is a pattern. The right sides are the squares $1^2, 2^2, 3^2, 4^2, \ldots$, and, when the right side is $n^2$, the left side is the sum of the first $n$ odd integers. As the $n$th odd integer is $2n - 1$, the following expression is true for $n = 1, 2, 3$, and $4$:

$$1 + 3 + 5 + \cdots + (2n - 1) = n^2. \tag{$p_n$}$$

Now it is almost irresistible to ask whether the statement $(p_n)$ is true for every $n \geq 1$. There is no hope of separately verifying all these statements, because there are infinitely many of them. A more subtle approach is required.

The idea is to prove that $p_k \Rightarrow p_{k+1}$ for every $k \geq 1$. Then the fact that $p_1$ is true implies that $p_2$ is true, which in turn implies that $p_3$ is true, then $p_4$, and so on. This is one of the most important axioms for the integers.


Principle of Mathematical Induction.⁵ Let $p_n$ be a statement for each integer $n \geq 1$. Suppose that the following conditions are satisfied:


(1) $p_1$ is true.
(2) $p_k \Rightarrow p_{k+1}$ for every $k \geq 1$.
Then $p_n$ is true for every $n \geq 1$.


In the proof that $p_k \Rightarrow p_{k+1}$, we assume that $p_k$ is true and use it to prove that $p_{k+1}$ is also true. The assumption that $p_k$ is true is called the induction hypothesis.

For a graphic illustration, consider an infinite row of dominoes labeled $1, 2, 3, \ldots$ standing so that if one is knocked over, it will knock the next one over. If $p_k$ is the statement that domino $k$ falls over, this means that $p_k \Rightarrow p_{k+1}$ for each $k \geq 1$. The principle of induction asserts that knocking domino 1 over causes them all to fall.

As another illustration, let $p_n$ be the statement $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ mentioned above. Then $p_1$ has already been verified. To prove that $p_k \Rightarrow p_{k+1}$ for each $k \geq 1$, we assume that $p_k$ is true (the induction hypothesis) and use it to simplify the left side of the sum $p_{k+1}$:

$$1 + 3 + 5 + \cdots + (2k - 1) + (2k + 1) = k^2 + (2k + 1) = (k + 1)^2.$$


⁵ One of the earliest uses of the principle is in the work of Francesco Maurolico in the 16th century. Augustus De Morgan coined the name mathematical induction in 1838.




This expression shows that $p_{k+1}$ is true and hence, by the induction principle, that $p_n$ is true for all $n \geq 1$.
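
A numerical check is no substitute for the induction argument, since it covers only finitely many of the statements $p_n$, but it is a useful sanity test of a conjectured formula. A minimal Python sketch (the helper name `sum_of_odds` and the bound 200 are our assumptions):

```python
def sum_of_odds(n):
    """Sum of the first n odd integers 1 + 3 + ... + (2n - 1)."""
    return sum(2 * i - 1 for i in range(1, n + 1))

# p_n holds for every n tested; the induction proof covers all n >= 1
checked = all(sum_of_odds(n) == n * n for n in range(1, 200))
print(checked)  # True
```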


Example 1. Prove Gauss’ Formula⁶: $1 + 2 + \cdots + n = \frac{1}{2}n(n + 1)$ for all $n \geq 1$.


Solution. Let $p_n$ denote the statement $1 + 2 + \cdots + n = \frac{1}{2}n(n + 1)$. Then $p_1$ is true because $1 = \frac{1}{2}(1 + 1)$. If we assume that $p_k$ is true for some $k \geq 1$, we get

$$1 + 2 + 3 + \cdots + k + (k + 1) = \tfrac{1}{2}k(k + 1) + (k + 1) = \tfrac{1}{2}(k + 1)(k + 2),$$

which shows that $p_{k+1}$ is true. Hence, $p_n$ is true for all $n \geq 1$ by the principle of mathematical induction. □


Example 2 gives an inductive proof of a useful formula for the sum of a geometric series $1 + x + \cdots + x^n$. We use the convention that $x^0 = 1$ for all numbers $x$.


Example 2. If $x$ is any real number, show that
$$(1 - x)(1 + x + \cdots + x^{n-1}) = 1 - x^n, \quad\text{for all } n \geq 1.$$

Solution. Let $p_n$ be the given statement. Then $p_1$ is $(1 - x)1 = 1 - x$, which is true. If we assume that $p_k$ is true for some $k \geq 1$, then the left side of $p_{k+1}$ becomes

$$\begin{aligned} (1 - x)(1 + x + \cdots + x^{k-1} + x^k) &= (1 - x)(1 + x + \cdots + x^{k-1}) + (1 - x)x^k \\ &= (1 - x^k) + (1 - x)x^k \\ &= 1 - x^{k+1}. \end{aligned}$$
This proves that $p_{k+1}$ is true and so completes the induction. □


Example 3. Let $w_n$ denote the number of $n$-letter words that can be formed using only the letters $a$ and $b$. Show that $w_n = 2^n$ for all $n \geq 1$.


Solution. Clearly, $a$ and $b$ are the only such words with one letter, so $w_1 = 2 = 2^1$. If $k \geq 1$, we obtain each such word of $k + 1$ letters by adjoining an $a$ or a $b$ to a word of $k$ letters, and there are $w_k$ of each type. Hence, $w_{k+1} = 2w_k$ for each $k \geq 1$ so, if we assume inductively that $w_k = 2^k$, we get $w_{k+1} = 2w_k = 2 \cdot 2^k = 2^{k+1}$, as required. □


The principle of induction starts at 1 in the sense that if $p_1$ is true and $p_k \Rightarrow p_{k+1}$ for all $k \geq 1$, then $p_n$ is true for all $n \geq 1$. There is nothing special about 1.


Theorem 1. If $m$ is any integer, let $p_m, p_{m+1}, p_{m+2}, \ldots$ be statements such that
(1) $p_m$ is true.
(2) $p_k \Rightarrow p_{k+1}$ for every $k \geq m$.

Then $p_n$ is true for each $n \geq m$.


⁶ This formula was probably known to the ancient Greeks. However, the great mathematician Carl Friedrich Gauss is said to have derived a special case of the formula ($n = 100$) at age 7 by writing the sum $1 + 2 + \cdots + 100$ in two parts:

$$\begin{array}{ccccccccc} 1 & + & 2 & + & \cdots & + & 49 & + & 50 \\ 100 & + & 99 & + & \cdots & + & 52 & + & 51 \end{array}$$

and observing that each pair of terms, $1 + 100, 2 + 99, \ldots, 50 + 51$, adds to 101. As there are 50 such pairs, the sum is $50 \cdot 101 = 5050$.




Proof. Let $t_n = p_{m+n-1}$ for each $n \geq 1$. Then $t_1 = p_m$ is true, and $t_k \Rightarrow t_{k+1}$ because $p_{m+k-1} \Rightarrow p_{m+k}$. Hence, $t_n$ is true for all $n \geq 1$ by induction; that is, $p_n$ is true for all $n \geq m$. ∎


Example 4. If $n \geq 8$, show that any postage of $n$ cents can be made exactly using only 3-cent and 5-cent stamps.


Solution. The assertion clearly holds if $n = 8$. If it holds for some $k \geq 8$, we consider two cases:


Case 1. One or more 5-cent stamps are used to make up $k$ cents postage.
Then replace one of them with two 3-cent stamps.


Case 2. Three or more 3-cent stamps are used to make up $k$ cents postage.
Then replace three of them with two 5-cent stamps.


Because one of these cases must occur (as $k \geq 8$), the assertion holds for $k + 1$ cents in both cases and the induction goes through. □
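
The inductive step of Example 4 is constructive, so it can be run as a procedure: start from $8 = 3 + 5$ and apply the appropriate case once for each additional cent. The following Python sketch does this; the function name `stamps` and the pair representation are our assumptions.

```python
def stamps(n):
    """Return (threes, fives) with 3*threes + 5*fives == n, for n >= 8,
    by repeating the inductive step of the example."""
    if n < 8:
        raise ValueError("the construction starts at n = 8")
    threes, fives = 1, 1          # base case: 8 = 3 + 5
    for _ in range(8, n):         # one inductive step per extra cent
        if fives >= 1:            # Case 1: replace one 5 by two 3s
            fives -= 1
            threes += 2
        else:                     # Case 2: replace three 3s by two 5s
            threes -= 3
            fives += 2
    return threes, fives

print(stamps(8), stamps(9), stamps(10), stamps(11))
# (1, 1) (3, 0) (0, 2) (2, 1)
```

When no 5-cent stamp is present, the total is a multiple of 3 that is at least 9, so at least three 3-cent stamps are available and Case 2 applies.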


If $n \geq 1$ is an integer, the integer $n!$ (read $n$-factorial) is defined to be the product
$$n! = n(n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1$$

of all the integers from $n$ to 1. Thus, $1! = 1$, $2! = 2$, $3! = 6$, and so on. Clearly,
$$(n + 1)! = (n + 1)\,n!, \quad\text{for each } n \geq 1,$$

which we extend to $n = 0$ by defining
$$0! = 1.$$
Example 5. Show that $2^n < n!$ for all $n \geq 4$.
Solution. If $p_k$ is the statement $2^k < k!$, note that $p_1$, $p_2$, and $p_3$ are actually false, but $p_4$ is true because $2^4 = 16 < 24 = 4!$. If $p_k$ is true where $k \geq 4$, then $2^k < k!$ so
$$2^{k+1} = 2 \cdot 2^k < 2 \cdot k! < (k + 1)\,k! = (k + 1)!$$

Hence, $p_{k+1}$ is true and the induction is complete. □


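Let $n$ and $r$ be integers with $0 \leq r \leq n$. The binomial coefficient $\binom{n}{r}$ is defined as follows:

$$\binom{n}{r} = \frac{n!}{r!\,(n - r)!}.$$
As $0! = 1$, we have $\binom{n}{0} = 1 = \binom{n}{n}$ and $\binom{n}{1} = n = \binom{n}{n-1}$. It is easy to verify that
$$\binom{n}{r} = \binom{n}{n-r} \quad\text{whenever } 0 \leq r \leq n.$$

We leave the proof of the following formula (the Pascal identity) as Exercise 13.

$$\binom{n}{r-1} + \binom{n}{r} = \binom{n+1}{r} \quad\text{whenever } 1 \leq r \leq n.$$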




The name honors Blaise Pascal. The identity leads to a way of displaying the 
binomial coefficients known as Pascal’s triangle: 


The $n$th row of the triangle is $\binom{n}{0}\ \binom{n}{1}\ \binom{n}{2}\ \cdots\ \binom{n}{n-1}\ \binom{n}{n}$, starting at $n = 0$. The
Pascal identity shows that each entry in a given row (except at the ends) can
be found by adding the two entries adjacent to it in the row above. Hence, Pascal’s
triangle is easy to write down row by row.® 
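
The row-by-row construction can be sketched in a few lines of Python. Here `pascal_rows` is a hypothetical helper name; each new row is built from the previous one using only additions, as the Pascal identity suggests, and the result is compared with `math.comb` from the standard library.

```python
from math import comb

def pascal_rows(count):
    """Return the first `count` rows of Pascal's triangle, starting at row 0."""
    rows = [[1]]
    for _ in range(count - 1):
        prev = rows[-1]
        # each interior entry is the sum of the two adjacent entries above it
        middle = [prev[i - 1] + prev[i] for i in range(1, len(prev))]
        rows.append([1] + middle + [1])
    return rows

for row in pascal_rows(5):
    print(row)
# [1]
# [1, 1]
# [1, 2, 1]
# [1, 3, 3, 1]
# [1, 4, 6, 4, 1]
```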

The entries in each row also arise in another way. The formulas 


$$\begin{aligned} (1 + x)^2 &= 1 + 2x + x^2, \\ (1 + x)^3 &= 1 + 3x + 3x^2 + x^3, \\ (1 + x)^4 &= 1 + 4x + 6x^2 + 4x^3 + x^4, \end{aligned}$$


are easily verified, and the coefficients on the right side in each case are the integers 
in rows 2,3, and 4 of Pascal’s triangle. The general result follows by induction, and 
will be used several times in this book. 


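Example 6. Prove the Binomial Theorem:
$$(1 + x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n, \quad\text{for all } n \geq 0.$$

Solution. The theorem holds if $n = 0$ because $\binom{0}{0} = 1$ and $(1 + x)^0 = 1$. If it holds for some $k \geq 0$ then, using the Pascal identity, we obtain

$$\begin{aligned} (1 + x)^{k+1} &= (1 + x)(1 + x)^k \\ &= (1 + x)\left[\binom{k}{0} + \binom{k}{1}x + \cdots + \binom{k}{k-1}x^{k-1} + \binom{k}{k}x^k\right] \\ &= \binom{k}{0} + \left[\binom{k}{0} + \binom{k}{1}\right]x + \cdots + \left[\binom{k}{k-1} + \binom{k}{k}\right]x^k + \binom{k}{k}x^{k+1} \\ &= \binom{k+1}{0} + \binom{k+1}{1}x + \cdots + \binom{k+1}{k}x^k + \binom{k+1}{k+1}x^{k+1}, \end{aligned}$$

which completes the induction. □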
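
The inductive step multiplies the expansion of $(1 + x)^k$ by $(1 + x)$ and collects coefficients. The Python sketch below mirrors that step on coefficient lists (constant term first); the helper name `times_one_plus_x` is an assumption, and the comparison uses `math.comb`.

```python
from math import comb

def times_one_plus_x(coeffs):
    """Multiply the polynomial with coefficient list `coeffs`
    (constant term first) by (1 + x)."""
    padded = coeffs + [0]     # 1 * p(x)
    shifted = [0] + coeffs    # x * p(x)
    return [a + b for a, b in zip(padded, shifted)]

poly = [1]                    # (1 + x)^0
for n in range(1, 7):
    poly = times_one_plus_x(poly)
    # the coefficients of (1 + x)^n are the binomial coefficients
    assert poly == [comb(n, r) for r in range(n + 1)]

print(poly)  # [1, 6, 15, 20, 15, 6, 1]
```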

When proving inductively that statements $p_m, p_{m+1}, \ldots$ are true, the most difficult part is usually showing that $p_k \Rightarrow p_{k+1}$ for each $k \geq m$. Clearly, this task would be easier if we could assume the truth of $p_m, \ldots, p_{k-1}$ in addition to the truth of $p_k$ when deducing $p_{k+1}$. This assumption leads to a useful variant of the principle of induction (in fact, it is equivalent to it).


®Note that this shows the binomial coefficients are all integers, a fact that is not clear from the 
definition. 




Theorem 2. Principle of Strong Induction. Let $m$ be an integer and, for each $n \geq m$, let $p_n$ be a statement. Suppose the following conditions are satisfied.


(1) $p_m$ is true.
(2) If $k \geq m$ and all of $p_m, p_{m+1}, \ldots, p_k$ are true, then $p_{k+1}$ is also true.
Then $p_n$ is true for every $n \geq m$.


Proof. For each $n \geq m$, let $t_n$ be the statement that $p_m, p_{m+1}, \ldots, p_n$ are all true. Then, $t_m$ is true by (1). If $t_k$ is true for some $k \geq m$, then (2) implies that $p_{k+1}$ is true, so $t_{k+1}$ is also true. Hence, $t_n$ is true for all $n \geq m$ by Theorem 1, so certainly $p_n$ is true for all $n \geq m$. ∎


In the next example, we use strong induction to prove an important fact about primes that would be more difficult to deduce using (ordinary) induction. Recall that a prime number (or prime) is an integer $p \geq 2$ that cannot be factored as a product of two smaller positive integers.


Example 7. Show that every integer $n \geq 2$ is a product of (one or more) primes.


Solution. This assertion is true if $n = 2$ because 2 is a prime. If $k \geq 2$, we assume inductively that $2, 3, \ldots, k$ are all products of primes. To apply strong induction, we must show that $k + 1$ is a product of primes. This is clear if $k + 1$ is itself prime; otherwise, let $k + 1 = ab$, where $2 \leq a \leq k$ and $2 \leq b \leq k$. Then both $a$ and $b$ are products of primes by the (strong) induction hypothesis, so $k + 1 = ab$ is also a product of primes. □
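
The strong induction in Example 7 translates directly into a recursive procedure: if $n$ is not prime, split it as $n = ab$ with smaller factors and factor each of them. The Python sketch below is an illustration of that argument (the name `prime_factors` is our assumption), not an efficient factoring method.

```python
from math import isqrt

def prime_factors(n):
    """Return a list of primes whose product is n, for an integer n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    for a in range(2, isqrt(n) + 1):
        if n % a == 0:
            # n = a * b with 2 <= a, b < n: both are handled by the
            # (strong) induction hypothesis, here a recursive call
            return prime_factors(a) + prime_factors(n // a)
    return [n]   # no proper factor found, so n is itself prime

print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
print(prime_factors(97))   # [97]
```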


We conclude with an intuitively clear property of Z that is equivalent to the 
principle of induction, and which is usually taken as an axiom. 


Well-Ordering Principle. Every nonempty set of nonnegative integers has a smallest member.


Proof. If the principle is false, let $X \subseteq \{0, 1, 2, \ldots\}$ be a nonempty set that has no smallest member. For each $n \geq 0$, let $p_n$ be the statement “$n \notin X$.” It suffices to show that $p_n$ is true for all $n \geq 0$, since then $X$ is empty, contrary to our assumption. We prove this by strong induction. First, $p_0$ is true because if $0 \in X$, then it is the smallest member of $X$ (because $X \subseteq \{0, 1, 2, \ldots\}$). Now assume inductively that $p_0, p_1, \ldots, p_k$ are all true, so that none of $0, 1, \ldots, k$ is in $X$. This implies that $k + 1 \notin X$ since otherwise it would be the smallest member of $X$. This means $p_{k+1}$ is true, and so completes the induction. ∎


The way the well-ordering principle is used can be illustrated by the following 
frivolous example: Suppose that we want to show that every positive integer is 
interesting. If this assertion were false, the set of uninteresting positive integers 
would be nonempty and so would contain a smallest member by the axiom. But 
the smallest uninteresting integer would surely be interesting—a contradiction! This 
technique can also be applied to serious situations. 

For example, the well-ordering principle implies the induction principle. Indeed, let $p_1, p_2, p_3, \ldots$ be statements such that $p_1$ is true and $p_k \Rightarrow p_{k+1}$ for every $k \geq 1$. If $X = \{n \geq 1 \mid p_n \text{ is false}\}$, we must show that $X$ is empty. But if not, then $X$ has a smallest member, which leads to a contradiction. The details are in Exercise 15.




We have proved the following implications (the first is Theorem 2):
$$\text{Induction} \Rightarrow \text{Strong Induction} \Rightarrow \text{Well Ordering}.$$


Moreover, well ordering implies induction (see above), so the three principles are 
logically equivalent. The validity of these principles is one of the basic Peano 
axioms? for the integers. 


Inductive Definition 


Many arguments in algebra (in fact, in mathematics generally) refer to sequences $a_0, a_1, a_2, a_3, \ldots, a_n, \ldots$ from a set $A$ where each $a_i$ is an element of $A$ called the $i$th term of the sequence. Hence $1, 2, 4, 8, 16, \ldots$ are the first five terms of the sequence $a_n = 2^n$ from $\mathbb{Z}$. This sequence can be compactly described as follows:


$$a_0 = 1 \quad\text{and}\quad a_n = 2a_{n-1} \ \text{ for each } n \geq 1. \tag{$*$}$$


These conditions uniquely describe the sequence (the formula $a_n = 2^n$ for $n \geq 0$ can be proved by induction), and for this reason $(*)$ is called an inductive definition of the sequence. More generally, a sequence is said to be defined inductively if the first term is specified and each later term is uniquely determined by the earlier terms (often by a formula). It is usually very difficult to give an explicit formula for the $n$th term $a_n$ in terms of the earlier terms; nevertheless, the following theorem shows that such a sequence always exists and is uniquely determined.


Theorem 3. Recursion Theorem. Given a set A and a ∈ A, there is exactly one
sequence a_0, a_1, a_2, a_3, ..., a_n, ... from A that satisfies the following requirements:
(1) a_0 = a.
(2) For each n ≥ 1, the term a_n is uniquely determined by the preceding terms
a_0, a_1, a_2, ..., a_{n−1}.


Proof. The existence of such a sequence is given in Appendix D; we prove uniqueness
by strong induction on n ≥ 0. Clearly, a_0 is uniquely determined by (1). If each of
a_0, a_1, a_2, ..., a_{n−1} has been uniquely specified, then a_n is uniquely determined by
(2). Hence, the sequence is uniquely determined by (1) and (2). □
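
An inductive definition can be read as a recipe for computing terms one at a time. The following Python sketch is only an illustration: the names build_sequence and step are hypothetical and not from the text. It builds the first few terms of a sequence from a starting value and a rule that sees all earlier terms, and then applies it to the definition (*).

```python
from typing import Callable, List

def build_sequence(a0: int, step: Callable[[List[int]], int], count: int) -> List[int]:
    """Return the first `count` terms a_0, ..., a_{count-1}.

    `step` receives the list of all earlier terms and returns the next term,
    mirroring condition (2) of the Recursion Theorem.
    """
    terms = [a0]                      # condition (1): a_0 = a
    while len(terms) < count:
        terms.append(step(terms))     # condition (2): next term from the earlier ones
    return terms[:count]

# The inductive definition (*): a_0 = 1 and a_n = 2 * a_{n-1} for n >= 1.
powers = build_sequence(1, lambda earlier: 2 * earlier[-1], 6)
```

Here powers agrees with the closed formula a_n = 2^n for n = 0, ..., 5, which is the statement that the text says can be proved by induction.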


Exercises 1.1 


1. Prove each equation by induction on n.
(a) 1 + 5 + 9 + ··· + (4n − 3) = n(2n − 1) for all n ≥ 1.
(b) 1^2 + 2^2 + ··· + n^2 = (1/6)n(n + 1)(2n + 1) for all n ≥ 1.
(c) 1^3 + 2^3 + ··· + n^3 = (1/4)n^2(n + 1)^2 for all n ≥ 1.
(d) 1·2 + 2·3 + ··· + n(n + 1) = (1/3)n(n + 1)(n + 2) for all n ≥ 1.
(e) 1·2^2 + 2·3^2 + ··· + n(n + 1)^2 = (1/12)n(n + 1)(n + 2)(3n + 5) for all n ≥ 1.
(f) gat ag Fo Sey at Jt all m= 1. 
(g) 1^2 + 3^2 + ··· + (2n − 1)^2 = (n/3)(4n^2 − 1) for all n ≥ 1.


†Named after Giuseppe Peano, an Italian mathematician and logician who, in 1889, reduced
the theory of the natural numbers N to five simple axioms. For a discussion of this, see R.A.
Beaumont and R.S. Pierce, The Algebraic Foundations of Mathematics, Addison-Wesley, 1963.


(h) 1^2 − 2^2 + 3^2 − ··· + (−1)^{n+1} n^2 = (1/2)(−1)^{n+1} n(n + 1) for all n ≥ 1.
(i) B+ 3434-4 3p =1- ain for alln > 1. 


2. Prove each inequality by induction on n.

(a) n < 2^n for all n ≥ 0.

(b) n^2 ≤ 2^n for all n ≥ 4.

(c) n! <2” for all n > 4 (compare with Example 5). 

() gt+ete+4 <2-+ foralln>1. 

(e) ty tet x 2 VM for alln2 1. 

(f) AtBtotE <2/n-1 for alln > 1. 

3. Prove each statement by induction on n.

(a) n^3 + (n + 1)^3 + (n + 2)^3 is a multiple of 9 for all n ≥ 1.

(b) n^3 − n is a multiple of 3 for all n ≥ 1.

(c) 3^{2n+1} + 2^{n+2} is a multiple of 7 for all n ≥ 0.

4. Show that (1 − 1/2^2)(1 − 1/3^2) ··· (1 − 1/n^2) = (n + 1)/(2n) for all n ≥ 2.

5. Show that 3^{3n} + 1 is a multiple of 7 for all odd n ≥ 1.

6. Suppose that n straight lines in the plane are positioned so that no two are parallel
and no three pass through the same point. Show that they divide the plane into
(1/2)(n^2 + n + 2) distinct regions.

7. Show that there are 3^n positive integers with n digits, where each digit must be 4, 5,
or 6.

8. A polygon in the plane is called convex if every line joining two vertices is either an
edge or lies entirely within the polygon. If n ≥ 3, show that the sum of the interior
angles of an n-sided convex polygon equals (n − 2) · 180°.

9. A straight line segment joining two distinct points on a circle is called a secant. For
n ≥ 1, draw n secants with no two identical. Show that the resulting regions can be
unambiguously colored black and white (where unambiguously means that no two
regions sharing a straight line boundary are of the same color).

10. (a) Show that any postage of n ≥ 2 cents can be made of 2 and 3 cent stamps.

(b) Show that any postage of n ≥ 12 cents can be made of 3 and 7 cent stamps.

(c) Show that any postage of n ≥ 18 cents can be made of 4 and 7 cent stamps.

(d) Can you generalize from the results in (a)-(c)?

11. Let a_n = 2^{3n} − 1 for n ≥ 0. Guess a common divisor of each a_n and prove your
assertion.

12. (a) Try to prove the statement "1^3 + 2^3 + ··· + n^3 is a perfect square" by induction.
Now look at Exercise 1(c).

(b) Try to prove that 1+ 3 + ; shes ae < 2 by induction. Now formulate a stronger 
equality for the sum on the left, prove it by induction, and use it to deduce the 
inequality. 

13. Prove the Pascal identity: (n choose r) + (n choose r − 1) = (n + 1 choose r) for 1 ≤ r ≤ n.

14. (a) Show that (n choose 0) + (n choose 1) + (n choose 2) + ··· + (n choose n) = 2^n for all n ≥ 0.

(b) Show that (n choose 0) − (n choose 1) + (n choose 2) − ··· + (−1)^n (n choose n) = 0 if n > 0.

15. Use the well-ordering principle to prove the principle of induction. [Hint: See the
discussion following the well-ordering principle.]

16. Let X be a nonempty set of integers. Then X is said to be bounded below (bounded
above) if an integer m exists such that m ≤ x for all x ∈ X (respectively m ≥ x for
all x ∈ X).


(a) If X is bounded below, show that it has a smallest member.
(b) If X is bounded above, show that it has a largest member.

17. Use strong induction to prove that every integer n ≥ 2 has a prime factor.

18. In each case, conjecture a formula for a_n and prove it by induction.
(a) a_0 = 2, a_{n+1} = −a_n, n ≥ 0.
(b) a_0 = 1, a_1 = −2, a_{n+2} = 2a_n − a_{n+1}, n ≥ 0.
(c) a_0 = 1, a_{n+1} = 1 − a_n, n ≥ 0.
(d) a_0 = 3, a_{n+1} = (a_n)^2, n ≥ 0.

19. Let n lines in the plane be such that no two are parallel and no three are concurrent.
Find the number a_n of regions into which the plane is divided by first showing that
a_{n+1} = a_n + (n + 1).

20. Prove the following induction principle.
Let m be an integer and let p_n be a statement for all n ≥ m. Assume that

(1) p_m and p_{m+1} are true.

(2) If k ≥ m and both p_k and p_{k+1} are true, then p_{k+2} is true.
Then p_n is true for all n ≥ m.

21. Let a_n denote a number for each integer n ≥ 0 and assume that a_{n+2} = a_{n+1} + 2a_n
holds for every n ≥ 0. Use the principle in Exercise 20 to prove each assertion.
(a) If a_0 = 1 and a_1 = −1, then a_n = (−1)^n for each n ≥ 0.
(b) If a_0 = 1 and a_1 = 2, then a_n = 2^n for each n ≥ 0.
(c) If a_0 = p and a_1 = q, then a_n = (1/3)[(p + q)2^n + (2p − q)(−1)^n] for each n ≥ 0.

22. Let p_n denote the statement: "3n + 2 is a multiple of 3." Show that p_k ⇒ p_{k+1} for
all k ≥ 1. What does this say about Theorem 1?

23. Let p_n denote the statement: "In any class of n algebra students, every student
obtains the same grade." Then p_1 is clearly true. If p_n is satisfied for some n ≥ 1, suppose
that x_1, x_2, ..., x_{n+1} denotes a class of n + 1 students. Then x_1, x_2, ..., x_n all have
the same grade (by induction) as do x_2, x_3, ..., x_{n+1}. Thus x_1, x_2, ..., x_{n+1} all have
the same grade (the same as x_2), so p_{n+1} is true. Hence, p_n is true for all n. What
is wrong with this argument?

24. Suppose that p_n is a statement about n for each n ≥ 1. In each case what must be
done to prove that p_n is true for all n ≥ 1?
(a) p_n ⇒ p_{n+2} for each n ≥ 1.
(b) Pn => Pn+e for each n > 1. 
(c) p_n ⇒ p_{n+1} for each n ≥ 10.

25. If p_n is a statement about n for each n ≥ 1, argue that p_n is true for all n ≥ 1 if
p_n ⇒ p_{n−1} for each n ≥ 2 and p_n is true for infinitely many values of n.

26. For a sequence a_1, a_2, ..., suppose that a_1 + a_2 + ··· + a_n is to be evaluated.
(a) If a sequence b_1, b_2, ... can be found such that a_n = b_{n+1} − b_n for all n ≥ 1,
prove by induction that a_1 + a_2 + ··· + a_n = b_{n+1} − b_1.
(b) Use the technique in (a) to evaluate 1·2·3 + 2·3·4 + ··· + n(n + 1)(n + 2).
[Hint: Try b_n = (n − 1)n(n + 1)(n + 2).]

27. Suppose that a sequence a_0, a_1, ... is given.
(a) Show that the sequence s_0, s_1, ... exists where s_0 = a_0 and s_n is the sum of the
first n + 1 of the a_i.
(b) Show that the sequence p_0, p_1, ... exists where p_0 = a_0 and p_n is the product of
the first n + 1 of the a_i.


1.2 DIVISORS AND PRIME FACTORIZATION 


Mathematics is the queen of the sciences and number theory is the queen of 
mathematics. 


- Carl Friedrich Gauss


The set Z of integers will be used in several ways throughout this book: as a major 
source of examples of algebraic systems; to state definitions and prove theorems 
(often by induction); and as a prototype for results about more general systems. 
For the most part, the properties of Z that we need are familiar facts about addition, 
multiplication, and ordering of the integers, although we present a more detailed 
look at these properties in Section 3.2. However, we also utilize several less familiar 
properties of divisibility and primes in Z and so devote this section to them. 


The Greatest Common Divisor 


When we write 22/7 in the form 3 1/7 we are using the fact that 22 = 3·7 + 1; that is,
22 leaves a remainder of 1 when divided by 7. The general result is a consequence
of the well-ordering axiom.


Theorem 1. Division Algorithm. Let n and d ≥ 1 be integers. There exist
uniquely determined integers q and r such that


n = qd + r and 0 ≤ r < d.


Proof. Let X = {n − td | t ∈ Z, n − td ≥ 0}. Then X is nonempty. In fact,
if n ≥ 0, then n = n − 0d is in X; if n < 0, then n − nd = n(1 − d) is in X.
Hence, by the well-ordering principle, let r be the smallest member of X. Then
r = n − qd for some q and r ≥ 0, so it remains to show that r < d. But if r ≥ d,
then 0 ≤ r − d = n − (q + 1)d. This means that r − d is in X, contradicting the
minimality of r. This result proves the existence of q and r.

To prove uniqueness, suppose also that n = q'd + r' with 0 ≤ r' < d. Assume
r ≤ r' (the case r' ≤ r is similar). Then (q − q')d = r' − r is a nonnegative, integral
multiple of d that is less than d (because r' − r ≤ r' < d). This can occur only if
r = r', which implies that q = q' and so proves uniqueness. □


For n and d ≥ 1, the integers q and r in Theorem 1 are called the quotient
and remainder, respectively. Thus, for example, if we divide n = −17 by d = 5,
the result is −17 = (−4)·5 + 3, so the quotient is −4 and the remainder is 3.
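
The quotient and remainder of Theorem 1 are easy to compute by machine. The sketch below is an illustration only (the function name division_algorithm is an assumption, not from the text); it relies on Python's floor division, which for d ≥ 1 returns the largest integer q with qd ≤ n, so the remainder always satisfies 0 ≤ r < d, even when n is negative.

```python
from typing import Tuple

def division_algorithm(n: int, d: int) -> Tuple[int, int]:
    """Return (q, r) with n == q*d + r and 0 <= r < d, assuming d >= 1."""
    if d < 1:
        raise ValueError("d must be at least 1")
    q = n // d          # floor division: largest integer q with q*d <= n
    r = n - q * d       # remainder, the smallest member of the set X in the proof
    return q, r

q, r = division_algorithm(-17, 5)   # the example in the text: -17 = (-4)*5 + 3
```

Note that some other languages truncate toward zero when dividing negative integers, which would give a negative remainder here; the floor convention is the one that matches Theorem 1.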

The division algorithm can also be seen geometrically. If the real line is marked off
in multiples of d, n clearly falls either on a multiple qd of d or between qd and (q + 1)d
(see the diagram). Hence, qd ≤ n < (q + 1)d, so 0 ≤ n − qd < d, and we take
r = n − qd.

[Diagram: the real line marked off in multiples of d, with n lying between qd and (q + 1)d.]

If both n and d are positive and a calculator is available, the quotient q and the
remainder r can be easily found as follows: Calculate n/d and let q denote the largest
integer that is less than or equal to n/d. Hence,


q ≤ n/d < q + 1.

If we multiply through by d, we get qd ≤ n < qd + d, that is, 0 ≤ n − qd < d, so take r = n − qd.


Example 1. Find the quotient and remainder if n = 4187 and d = 129.


Solution. We have n/d = 32.457 approximately, so q = 32. Then r = n − dq = 59,
and so 4187 = 32·129 + 59, as desired. □


If n and d are integers, d is called a divisor of n if n = qd for some integer q.
When this is the case, we write d|n. If d|n is not true, we write d∤n. Thus, 7|84 but
7∤85. Note that 1|n and n|0 for all integers n. The following properties of divisors
will be used frequently.


Theorem 2. Let m, n, and d denote integers.

(1) n|n for all n.
(2) If d|m and m|n, then d|n.
(3) If d|n and n|d, then d = ±n.
(4) If d|n and d|m, then d|(xn + ym) for all integers x and y.

Proof. The proofs of (1) and (2) are left to the reader. In (3), let n = qd and d = pn
for integers p and q. If d = 0, then n = qd = 0 = d. If d ≠ 0, then d = pn = pqd,
which implies that 1 = pq. As p and q are integers, this means that p = q = 1 or
p = q = −1, and so d = n or d = −n, which proves (3). As to (4), if n = ad and
m = bd, then xn + ym = (xa + yb)d, so d|(xn + ym), as required. □

Expressions of the form xn + ym, where x and y are integers, are called linear
combinations of n and m.


Example 2. If d ≥ 1 is such that d|(3k + 5) and d|(7k + 2) for some k, show that
d = 1 or d = 29.


Solution. The hypotheses and (4) of Theorem 2 imply that d divides the linear
combination 7(3k + 5) − 3(7k + 2) = 35 − 6 = 29. Hence, d is a positive divisor of
29, so d = 1 or d = 29. □


An integer d is called a common divisor of two integers m and n if d|m and
d|n. To motivate the next theorem, consider the positive divisors of 36 and 84:


• Positive divisors of 36: 1, 2, 3, 4, 6, 9, 12, 18, 36
• Positive divisors of 84: 1, 2, 3, 4, 6, 7, 12, 14, 21, 28, 42, 84
• Common divisors: 1, 2, 3, 4, 6, 12

We wish to focus attention on the fact that the largest common divisor 12 is actually
a multiple of all the other positive common divisors. This idea is built into the
following definition. Let m and n be integers.


An integer d is called a greatest common divisor of m and n if:
(1) d ≥ 1
(2) d|m and d|n
(3) If k|m and k|n, then k|d.
When it exists we write d = gcd(m,n).


For example, gcd(18, 30) = 6, gcd(6, 7) = 1, and gcd(−9, 15) = 3.
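
The three conditions in the definition can be checked by brute force for small numbers. The following sketch is illustrative only; the helper name positive_divisors is an assumption, and listing all divisors this way is far too slow for large integers. It recomputes the lists for 36 and 84 and confirms that the largest common divisor is a multiple of every other common divisor.

```python
from typing import List

def positive_divisors(n: int) -> List[int]:
    """All positive divisors of a nonzero integer n, found by direct search."""
    return [k for k in range(1, abs(n) + 1) if n % k == 0]

# Common positive divisors of 36 and 84, as in the motivating lists.
common = sorted(set(positive_divisors(36)) & set(positive_divisors(84)))
largest = max(common)

# Condition (3): every common divisor k divides the largest one.
satisfies_condition_3 = all(largest % k == 0 for k in common)
```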
Conditions (2) and (3) can be stated as follows: gcd(m,n) is a common divisor
of m and n by (2), which is a multiple of every common divisor by (3). If it exists,
d = gcd(m,n) is unique. In fact, if d' is another integer satisfying (1), (2), and (3),
then d'|d by (3). Similarly, d|d' so d = ±d' by Theorem 2. But then d' = d because
we insist that greatest common divisors are positive.

The following fundamental theorem shows that, if m and n are not both zero,
then d = gcd(m,n) does indeed exist and, surprisingly, that d is actually a linear
combination of m and n.


Theorem 3. Let m and n be integers, not both zero. Then d = gcd(m,n) exists
and d = xm + yn for some integers x and y.


Proof. Let X = {xm + yn | x, y ∈ Z, xm + yn ≥ 1}. Then X is not empty because
m^2 + n^2 ∈ X, so let d be the smallest member of X (by the well-ordering principle).
Since d ∈ X, we have d ≥ 1 and d = xm + yn for integers x and y. Also, if k|m
and k|n, then k|(xm + yn) = d by Theorem 2. So it remains to show that d|m and
d|n.
To show that d|m, write m = qd + r where 0 ≤ r ≤ d − 1. Then,
r = m − qd = m − q(xm + yn) = (1 − qx)m + (−qy)n.
Hence, if r ≥ 1, then r ∈ X and r < d, contradicting the choice of d. So r = 0, that
is, m = qd. Thus, d|m, and d|n is proved similarly. □


Note that gcd(m,n) does not exist if m = 0 = n (verify), which explains the
requirement in Theorem 3 that m and n are not both zero. Also, the greatest
common divisor of m and n can be a linear combination of m and n in more than
one way. For example, gcd(2,3) = 1 and we have 1 = 2·2 − 3 and 1 = 3 − 2.


Example 3. If p and q are distinct primes, show that gcd(p, q) = 1.


Solution. Write d = gcd(p, q). Then d|p, so d = 1 or p. Similarly, d = 1 or q, so
d = 1 because, otherwise, p = d = q is contrary to the assumption that p ≠ q. □


The next example (which is needed later) illustrates how the definition of the
greatest common divisor is used.


Example 4. If m = qn + r, show that gcd(m,n) = gcd(n,r).


Solution. Write d = gcd(m,n) and k = gcd(n,r). Then k divides both n and r and
so divides m = qn + r. Thus, k is a common divisor of m and n, so k|d because
d = gcd(m,n). A similar argument (using r = −qn + m) shows that d|k, so d = ±k
by (3) of Theorem 2. Hence, d = k, because both d and k are positive. □


How do we compute d = gcd(m,n) in general? There is an efficient procedure
for doing so, which also shows how to express d as a linear combination
of m and n. To illustrate how it works, consider the numbers 78 and 30. The idea
is to use the division algorithm repeatedly. First divide 78 by 30:


78 = 2·30 + 18
30 = 1·18 + 12
18 = 1·12 + 6
12 = 2·6 + 0


At each stage (after the first) we divide the divisor at the previous stage by the
remainder at that stage. The last nonzero remainder is 6, and this equals gcd(78, 30).
This is no coincidence as we shall see. To express 6 as a linear combination of 78
and 30, eliminate the remainders from the second-last line up:


6 = 18 − 1·12
  = 18 − (30 − 1·18)
  = 2·18 − 30
  = 2(78 − 2·30) − 30
  = 2·78 − 5·30


This procedure is called the euclidean algorithm, and it works in general. For
positive integers m and n, we use the division algorithm repeatedly:


m = q_1 n + r_1,        r_1 < n
n = q_2 r_1 + r_2,      r_2 < r_1
r_1 = q_3 r_2 + r_3,    r_3 < r_2
...


At each stage we divide the divisor at the previous stage by the remainder, so the
remainders form a decreasing sequence of nonnegative integers:


n > r_1 > r_2 > r_3 > ··· ≥ 0.


Clearly, we must encounter a remainder of 0 (in at most n steps). If r_t denotes the
last nonzero remainder, the last two equations are


r_{t−2} = q_t r_{t−1} + r_t and r_{t−1} = q_{t+1} r_t + 0.
Now, repeated application of the result in Example 4 gives
gcd(m,n) = gcd(n, r_1) = gcd(r_1, r_2) = ··· = gcd(r_{t−1}, r_t) = r_t.
Hence, gcd(m, n) really is the last nonzero remainder.
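
The repeated division just described translates directly into a short loop. The following is a minimal sketch under the same assumption as the text (m and n positive); the name euclid_gcd is hypothetical. Each pass replaces the pair (m, n) by (n, r), which is justified by Example 4, and the loop stops when the remainder is 0.

```python
def euclid_gcd(m: int, n: int) -> int:
    """gcd of positive integers m and n as the last nonzero remainder."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive")
    while n != 0:
        m, n = n, m % n   # gcd(m, n) = gcd(n, r) where m = q*n + r
    return m              # the last nonzero remainder

d = euclid_gcd(78, 30)    # the worked example: remainders 18, 12, 6, 0
```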
Example 5. Find gcd(41, 12) and express it as a linear combination of 41 and 12.


Solution. The algorithm is not needed to find gcd(41,12). In fact, 1 and 41 are
the only positive divisors of 41, so gcd(41,12) = 1 because 41 does not divide 12.
However, guessing a linear combination 1 = x·41 + y·12 is not easy. The euclidean
algorithm gives


41 = 3·12 + 5
12 = 2·5 + 2
5 = 2·2 + 1
2 = 2·1 + 0

Hence, gcd(41, 12) = 1 as expected. Elimination of remainders gives

1 = 5 − 2·2
  = 5 − 2(12 − 2·5)
  = 5·5 − 2·12
  = 5(41 − 3·12) − 2·12
  = 5·41 − 17·12


which is the required linear combination. □
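
The coefficients x and y can also be found mechanically. The sketch below is one common way to do it and is not the back-substitution shown in the examples: instead of eliminating remainders from the bottom up, it carries the coefficients forward, keeping each remainder written as a combination of m and n at every step. The function name extended_gcd is an assumption made for illustration.

```python
from typing import Tuple

def extended_gcd(m: int, n: int) -> Tuple[int, int, int]:
    """Return (d, x, y) with d = gcd(m, n) = x*m + y*n, for positive m and n."""
    # Invariants: old_r == old_x*m + old_y*n  and  r == x*m + y*n.
    old_r, r = m, n
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r                      # quotient from the division algorithm
        old_r, r = r, old_r - q * r         # next remainder
        old_x, x = x, old_x - q * x         # same update applied to the coefficients
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

d1, x1, y1 = extended_gcd(78, 30)   # 6 = 2*78 - 5*30
d2, x2, y2 = extended_gcd(41, 12)   # 1 = 5*41 - 17*12
```

For these two inputs the forward method happens to return the same coefficients as the eliminations in the text; in general the coefficients are not unique, so a different method may give a different but equally valid pair.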




The following definition will be used frequently throughout this book. 
Two integers m and n are called relatively prime if gcd(m,n) = 1. 


For example, 2 and 3 are relatively prime, as are 20 and 9. Note that 1 is relatively 
prime to every integer n. The condition in Theorem 4 is useful. 


Theorem 4. Let m and n be integers, not both zero. Then m and n are relatively
prime if and only if 1 = xm + yn for some integers x and y.

Proof. If gcd(m,n) = 1, then 1 = xm + yn by Theorem 3. Conversely, if 1 = xm + yn,
then any common divisor of m and n must divide 1. In particular, gcd(m,n) = 1. □

For example, any two consecutive integers k and k + 1 are relatively prime because
(k + 1) − k = 1. Similarly, 5(6k + 5) − 6(5k + 4) = 1 shows that 6k + 5 and 5k + 4
are relatively prime for any integer k.


Corollary. If d = gcd(m,n), m, n ∈ Z, then m/d and n/d are relatively prime.
Proof. If d = xm + yn, x, y ∈ Z, dividing by d gives 1 = x(m/d) + y(n/d). □


The following theorem contains two very useful properties of relatively prime 
integers, and will be referred to several times below. 


Theorem 5. Let m and n be relatively prime integers.
(1) If m|k and n|k for some integer k, then mn|k.
(2) If m|kn for some integer k, then m|k.


Proof. We first prove (1). By Theorem 4, let 1 = xm + yn, where x and y are
integers. If k = qm and k = pn where p and q are integers, then


k = 1·k = xmk + ynk = xm(pn) + yn(qm) = (xp + yq)mn.
Hence, mn|k, proving (1). As to (2), let kn = qm where q is an integer. Then,
k = 1·k = xmk + ynk = xmk + y(qm) = (xk + yq)m.
This shows that m|k, and so proves (2). □


Prime Factorization 


Clearly, every integer n ≥ 2 has at least two positive divisors: 1 and n. The integers
for which these are the only positive divisors are important. An integer p is called
a prime if it satisfies the following conditions:


(1) p ≥ 2.
(2) If d|p and d > 0, then either d = 1 or d = p.


Thus, the first few primes are 2, 3, 5, 7, 11, 13, .... We know (Example 7 §1.1) that
every integer greater than 1 is a product of primes; the reason for not regarding 1
as a prime is to ensure that this factorization is unique (see Theorem 7).

If the product of two integers is even, one of these integers must be even (because 
the product of two odd integers is odd). We can rephrase this statement as follows: 
If 2|mn, where m and n are integers, then 2|m or 2|n. This statement holds for any 
prime in place of 2. 




Theorem 6. Euclid's Lemma. Let p denote a prime.
(1) If p|mn where m and n are integers, then p|m or p|n.
(2) If p|m_1 m_2 ··· m_r where each m_i is an integer, then p|m_i for some i.


Proof. (1) Write d = gcd(m, p). Then d|p, so d = 1 or d = p because p is a prime.
If d = p, then p|m because d|m; if d = 1, then p|n by (2) of Theorem 5.

(2) This assertion follows by induction on r. If r = 1, it is obvious. If (2) holds
for some r ≥ 1, let p|m_1 m_2 ··· m_r m_{r+1}. Then (1) shows that either p|m_1 ··· m_r or
p|m_{r+1}. In the first case, p|m_i for some i = 1, 2, ..., r by the induction hypothesis.
Hence, in any case, p|m_i for some i = 1, 2, ..., r + 1, completing the induction. □


Note that Euclid's lemma fails for nonprimes. For example, 6 is a divisor of 3·4,
but 6 does not divide 3 or 4.

It is not too difficult to convince yourself that every integer n ≥ 2 is either a
prime itself or can be factored as a product of primes: just keep factoring as long as
possible. For example, 12 = 2^2·3, 25 = 5^2, and 360 = 2^3·3^2·5. In fact, every integer
greater than 1 is a product of primes, and this factorization is unique up to the order
of the factors.


Theorem 7. Prime Factorization Theorem.


(1) Every integer n ≥ 2 is a product of (one or more) primes.
(2) This factorization is unique up to the order of the factors. That is, if


n = p_1 p_2 ··· p_r and n = q_1 q_2 ··· q_s


where p_i and q_j are primes, then r = s and the q_j can be relabeled
so that p_i = q_i for all i = 1, 2, ..., r.


Proof. We proved (1) in Example 7 §1.1. If (2) fails, let (by the well-ordering
principle) m ≥ 2 be the smallest integer with two distinct factorizations into primes:


m = p_1 p_2 ··· p_r = q_1 q_2 ··· q_s.


Then m is not a prime (verify), so r ≥ 2 and s ≥ 2. We have p_1|q_1 q_2 ··· q_s, so p_1|q_j
for some j by Euclid's lemma. By relabeling the q_j, we may assume that p_1|q_1. Then
p_1 = q_1 because both are primes, so

m/p_1 = p_2 ··· p_r = q_2 ··· q_s
is an integer, smaller than m, that admits two distinct factorizations into primes.
This result contradicts the choice of m, and so proves (2). □


Corollary. Two integers m ≥ 2 and n ≥ 2 are relatively prime if and only if no
prime divides both m and n.


Proof. Write d = gcd(m, n). If d = 1, then any common prime divisor would have to
divide 1, so no such common divisor exists. Conversely, suppose no prime divides
both m and n. If d > 1 and p|d where p is a prime, then p|m and p|n, contrary to
our assumption. So d = 1, that is, m and n are relatively prime. □


If n ≥ 2 is an integer and p_1, p_2, ..., p_r are the distinct prime divisors of n, the
prime factorization theorem asserts that n can be written uniquely in the form


n = p_1^{n_1} p_2^{n_2} ··· p_r^{n_r},


where n_i ≥ 1 for each i. This means that the primes p_i and the integers n_i are
uniquely determined by n. For example, 60 = 2^2·3·5 and 882 = 2·3^2·7^2.

If n has only one prime divisor, we call it a prime power, examples being
7 = 7^1, 9 = 3^2, and 32 = 2^5. At the other extreme, we say that n is square free
if all the exponents n_i = 1. Hence, any prime is square free as are 6 = 2·3 and
70 = 2·5·7.

If n is not prime, it must have a prime divisor p ≤ √n (it cannot have two prime
divisors greater than √n). So to test whether n is prime, it suffices to verify that
it has no prime divisor p ≤ √n (which is impractical if n is very large).


Example 6. Factor 1591 into primes.


Solution. We start dividing 1591 by the successive primes, 2, 3, 5, 7, .... Since
√1591 < 40 (because 40^2 = 1600), we need go only as high as 37; in fact, the first
prime that divides 1591 is 37. As 1591 = 37·43 and 43 is a prime, we have the
required prime factorization. □
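
The method of Example 6 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an efficient factoring routine: the name prime_factorization is hypothetical, and for simplicity the loop tries every integer p ≥ 2 rather than only primes. That is harmless, because by the time a composite candidate is reached its prime factors have already been divided out, so it cannot divide what remains.

```python
from typing import Dict

def prime_factorization(n: int) -> Dict[int, int]:
    """Return {prime: exponent} for an integer n >= 2, by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors: Dict[int, int] = {}
    p = 2
    while p * p <= n:                 # only candidates p <= sqrt(n) are needed
        while n % p == 0:             # divide out every copy of p
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                         # what remains has no divisor <= its square root, so it is prime
        factors[n] = factors.get(n, 0) + 1
    return factors

example_6 = prime_factorization(1591)   # 1591 = 37 * 43
```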


Obviously, the method in Example 6 requires that we have a list of the primes.
Although large tables of primes are available, the method clearly fails for very large
numbers. Finding the prime factorization of large integers is very difficult. Even so,
on December 15, 2005 it was announced that 2^{30,402,457} − 1 is a prime with 9,152,052
digits, the largest prime known to that date. Such a result requires a very large
amount of computer time.†

The prime factorization theorem gives a systematic way of listing all the positive
divisors of an integer n when the prime factorization of n is known. For example,
if n = 12 = 2^2·3, these divisors are 1, 2, 3, 4, 6, and 12, and they can be written as


1 = 2^0·3^0    2 = 2^1·3^0    4 = 2^2·3^0

3 = 2^0·3^1    6 = 2^1·3^1    12 = 2^2·3^1

Thus, they can all be expressed as 2^r·3^s, where 0 ≤ r ≤ 2 and 0 ≤ s ≤ 1 (where
p^0 = 1 for any prime p). The general situation is as follows:


Theorem 8. Let n be an integer with prime factorization


n = p_1^{n_1} p_2^{n_2} ··· p_r^{n_r},


where p_i are distinct primes and n_i ≥ 1 for each i. Then the positive divisors
of n are precisely the integers d of the form:


d = p_1^{d_1} p_2^{d_2} ··· p_r^{d_r},


where 0 ≤ d_i ≤ n_i holds for each i.


Proof. The prime divisors of d are contained in {p_1, ..., p_r} by Euclid's lemma,
and d cannot contain a higher power of p_i than p_i^{n_i} by Theorem 7. □
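
Theorem 8 gives a direct way to enumerate divisors once the factorization is known: choose one exponent d_i between 0 and n_i for each prime and multiply. The sketch below illustrates this; the function name and the dictionary representation {prime: exponent} are assumptions made for the example.

```python
from itertools import product
from typing import Dict, List

def divisors_from_factorization(factors: Dict[int, int]) -> List[int]:
    """List the positive divisors of n, given n as {prime: exponent}."""
    primes = list(factors)
    # For each prime p_i the exponent d_i ranges over 0, 1, ..., n_i.
    exponent_ranges = [range(factors[p] + 1) for p in primes]
    divisors = []
    for exponents in product(*exponent_ranges):
        d = 1
        for p, e in zip(primes, exponents):
            d *= p ** e
        divisors.append(d)
    return sorted(divisors)

divisors_of_12 = divisors_from_factorization({2: 2, 3: 1})   # 12 = 2^2 * 3
```

Since each exponent is chosen independently, the number of divisors produced is the product of the values n_i + 1.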


†On the other hand, in 2002, Manindra Agrawal and two undergraduate students (Neeraj Kayal
and Nitin Saxena) gave a simple algorithm that can decide whether a given integer n is prime or
not. Moreover, the time taken is approximately a polynomial function of the number of digits of n.
This is an important breakthrough in computer science.


In much the same way, the prime factorization theorem provides a simple way
to compute the greatest common divisor of any finite set of positive integers (rather
than just two). It also provides the "dual" notion, the least common multiple. The
definitions are as follows. Let n_1, n_2, ..., n_r be positive integers.


(1) The greatest common divisor gcd(n_1, n_2, ..., n_r) of these integers is the
positive common divisor that is a multiple of every common divisor.

(2) The least common multiple lcm(n_1, n_2, ..., n_r) of these integers is the
positive common multiple that is a divisor of every common multiple.


Thus, gcd(4, 6, 10) = 2 and lcm(4, 6, 10) = 60 by inspection. Theorem 9 below shows
that the gcd and lcm always exist. They are uniquely determined in the same way as
the gcd of two integers (see the discussion preceding Theorem 3). The next example
illustrates a systematic method for finding the gcd and lcm.


Example 7. Find d = gcd(12, 20, 18) and m = lcm(12, 20, 18).


Solution. We might find d = 2 by experiment, but m = 180 is not clear. A
systematic method involves writing the prime factorizations as follows:


12 = 2^2·3^1·5^0
20 = 2^2·3^0·5^1
18 = 2^1·3^2·5^0


We have d = 2^a·3^b·5^c for some a, b, and c by Theorem 8. We have a ≤ 1 because
d|18, and b = c = 0 because d|20 and d|12. Thus, d = 2 is the largest possibility.
Similarly, write the prime factorization of m as m = 2^p·3^q·5^r·k, where k ≥ 1 is
the factor involving primes (if any) other than 2, 3, or 5. Then p ≥ 2 because 12|m
(or because 20|m), q ≥ 2 because 18|m, and r ≥ 1 because 20|m. The smallest
possibility is thus m = 2^2·3^2·5^1 = 180. □


In Example 7, the power of 2 in d = gcd(12, 20, 18) is the smallest of the powers
of 2 occurring in 12, 20, and 18; the same is true for the powers of 3 and 5 in d.
Similarly, the power of 2 in m = lcm(12, 20, 18) is the largest of the powers of 2 in
12, 20, and 18, with similar statements for the primes 3 and 5. This method works
in general. For finitely many integers a, b, c, ..., let


max(a, b, c, ...) and min(a, b, c, ...)
denote the largest and the smallest of these integers, respectively. For example, we
have max(3, 1, −5, 3) = 3 and min(1, 0, 5) = 0.
Using Theorem 8, the solution to Example 7 extends to a proof of Theorem 9.

Theorem 9. Let {a, b, c, ...} be a finite set of positive integers, and write

a = p_1^{a_1} p_2^{a_2} ··· p_r^{a_r}
b = p_1^{b_1} p_2^{b_2} ··· p_r^{b_r}
c = p_1^{c_1} p_2^{c_2} ··· p_r^{c_r}
...


where p_i are primes dividing at least one of a, b, c, ..., and where an exponent
is zero if the prime in question does not occur in that number. Then,


gcd(a, b, c, ...) = p_1^{k_1} p_2^{k_2} ··· p_r^{k_r},
lcm(a, b, c, ...) = p_1^{m_1} p_2^{m_2} ··· p_r^{m_r},


where k_i = min(a_i, b_i, c_i, ...) and m_i = max(a_i, b_i, c_i, ...) for each i.


Example 8. Find gcd(63, 60, 105) and lcm(63, 60, 105).
Solution. The prime factorizations are
63 = 2^0·3^2·5^0·7^1, 60 = 2^2·3^1·5^1·7^0, and 105 = 2^0·3^1·5^1·7^1.
Hence, gcd(63, 60, 105) = 2^0·3^1·5^0·7^0 = 3 and lcm(63, 60, 105) = 2^2·3^2·5^1·7^1 = 1260. □
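
The min/max rule of Theorem 9 can be sketched as follows. This is only an illustration: the helper names factorize and gcd_lcm are assumptions, the factorization is done by simple trial division, and a Counter is used so that a prime that does not occur in a number automatically has exponent 0, as the theorem requires.

```python
from collections import Counter
from typing import List, Tuple

def factorize(n: int) -> Counter:
    """Prime exponents of a positive integer n, by trial division."""
    exps: Counter = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            exps[p] += 1
            n //= p
        p += 1
    if n > 1:
        exps[n] += 1
    return exps

def gcd_lcm(numbers: List[int]) -> Tuple[int, int]:
    """Return (gcd, lcm) of a nonempty list of positive integers via Theorem 9."""
    tables = [factorize(n) for n in numbers]
    primes = set().union(*tables)              # every prime dividing at least one number
    g, l = 1, 1
    for p in primes:
        exponents = [t[p] for t in tables]     # Counter gives 0 for an absent prime
        g *= p ** min(exponents)               # k_i = min(a_i, b_i, c_i, ...)
        l *= p ** max(exponents)               # m_i = max(a_i, b_i, c_i, ...)
    return g, l

example_8 = gcd_lcm([63, 60, 105])
```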


Of course we can use Theorem 9 to find lcm(a, b) and gcd(a, b) for two integers
a and b. However, the euclidean algorithm is also available to compute gcd(a, b), so
the next result is useful for finding lcm(a, b).


Corollary. If a and b are positive integers, then lcm(a, b) · gcd(a, b) = ab.


Proof. The assertion follows from Theorem 9 and the fact that, for integers m and
n, max(m,n) + min(m,n) = m + n. □


Note that lcm(a, b, c) · gcd(a, b, c) ≠ abc can occur (consider Example 8).
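
The corollary gives a practical way to get the lcm of two integers from their gcd, with no factoring needed. The sketch below uses Python's standard math.gcd; the name lcm_two is an assumption. It also checks the remark about three integers on the numbers of Example 8.

```python
from functools import reduce
from math import gcd

def lcm_two(a: int, b: int) -> int:
    """lcm of two positive integers via lcm(a, b) * gcd(a, b) = a * b."""
    return a * b // gcd(a, b)

# Three integers: fold the two-argument functions over the triple from Example 8.
triple = (63, 60, 105)
g3 = reduce(gcd, triple)        # gcd(63, 60, 105)
l3 = reduce(lcm_two, triple)    # lcm(63, 60, 105)
product_of_triple = 63 * 60 * 105
```

Here l3 · g3 is much smaller than the product of the three numbers, so the two-integer identity really does fail for three.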
We conclude with one last application of the prime factorization theorem. 


Theorem 10. Euclid's Theorem. There are infinitely many primes.


Proof. Suppose, on the contrary, that there are only n primes, denoted p_1, p_2, ..., p_n.
Then consider the integer m = 1 + p_1 p_2 ··· p_n. Since m ≥ 2, some prime divides m
by Theorem 7. But if p_i|m, then p_i divides m − p_1 p_2 ··· p_n = 1, a contradiction.
Hence the assumption that there are only finitely many primes is untenable. □
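
The construction in the proof can be tried on any finite list of primes. The sketch below is an illustration only, with a hypothetical helper smallest_prime_factor; it forms m = 1 + p_1 p_2 ··· p_n for the first six primes and finds a prime divisor of m, which by the argument above cannot be in the list. Note that m itself need not be prime; the proof only uses the fact that m has some prime divisor.

```python
from math import prod

def smallest_prime_factor(m: int) -> int:
    """Smallest prime dividing an integer m >= 2, by trial division."""
    p = 2
    while p * p <= m:
        if m % p == 0:
            return p
        p += 1
    return m                     # no divisor up to sqrt(m), so m is prime

primes = [2, 3, 5, 7, 11, 13]
m = 1 + prod(primes)             # 30031, which is not itself prime
new_prime = smallest_prime_factor(m)
```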


Euclid’s theorem certainly implies that there are infinitely many odd primes, 
that is, primes of the form 2k +1, k =0,1,..., and a natural question is whether 
there are infinitely many primes of the form mk + n for any positive integers m and 
n. This clearly cannot happen unless m and n are relatively prime. However, in this 
case it is valid, a result first proved by P.G.L. Dirichlet. One instance of Dirichlet’s 
theorem is treated in Exercise 39. 

However, there are many unanswered questions about primes, among them the 
celebrated Goldbach conjecture, which asserts that every even integer greater 
than 2 is the sum of two primes. The conjecture dates from 1742 and originated in
some correspondence between C. Goldbach and L. Euler. It is not known whether 
this assertion is true; the question appears to be extremely difficult to answer. The 
best result known is that every sufficiently large even number is the sum of a prime 
and a number that is the product of at most two primes. 


Exercises 1.2 


1. In each case find the quotient and remainder when n is divided by d.

(a) n = 391, d = 17   (b) n = 401, d = 19
(c) n = −116, d = 13   (d) n = −162, d = 17

2. In each case write r = n − qd, as in Example 1.

(a) n = 51837, d = 386   (b) n = 39214, d = 871

3. If n and d ≠ 0 are integers, show that integers q and r exist such that n = qd + r
and 0 ≤ r < |d|.

4. Show that the negative divisors of an integer n are just the negatives of the positive 
divisors. 


5. If m and n are odd integers, show that m^2 − n^2 is divisible by 8.

6. Given three consecutive integers, show that one must be a multiple of 3.

7. (a) If d > 0, d|(11k + 4), and d|(10k + 3) for some integer k, show that d = 1 or
d = 7.
(b) If d > 0, d|(35k + 26), and d|(7k + 3) for some integer k, show that d = 1 or
d = 11.

8. Explain why gcd(0,0) does not exist. If n > 0, what is gcd(0,n)?

9. In each case, compute gcd(m,n) and express it as a linear combination of m and n.

(a) m = 72, n = 42   (b) m = 41, n = 25
(c) m = 327, n = 54   (d) m = 198, n = 241
(e) m = 877, n = 29   (f) m = 527, n = 31
(g) m = 72, n = −175   (h) m = −231, n = 150

10. If m ≥ 1, show that m|n if and only if gcd(m,n) = m.

11. Let d = gcd(m,n). If k|d, k ≥ 1, show that gcd(m/k, n/k) = d/k.

12. If m and n are relatively prime and k|m, show that k and n are relatively prime.

13. Is n^2 + n + 11 prime for all n ≥ 1? Support your answer.

14. Show that gcd(m + n, m) = gcd(m,n).

15. If m|m_1 and n|n_1, show that gcd(m,n)|gcd(m_1, n_1).

16. If n|k(n + 1), show that n|k.

17. If gcd(m,n) = 1 and gcd(k,n) = 1, show that gcd(mk,n) = 1.

18. If gcd(m,n) = 1, let d = gcd(m + n, m − n). Show that d = 1 or d = 2.

19. Show that gcd(km, kn) = k gcd(m,n) if k ≥ 1.

20. Show that m and n are relatively prime if and only if no prime divides both.

21. Suppose that p ≥ 2 is an integer with the following property: If m and n are integers
and p|mn, either p|m or p|n. Show that p must be a prime.

22. If d_1, ..., d_r are all divisors of n and if gcd(d_i, d_j) = 1 whenever i ≠ j, show that
d_1 d_2 ··· d_r divides n.

23. If d = gcd(a,n), must a/d and n be relatively prime? Prove or disprove.

24. Show that any two consecutive odd integers are relatively prime.

25. Show that 3, 5, and 7 is the only prime triple (that is, three consecutive odd integers,
each of which is prime). It is not known if there are infinitely many prime pairs.

26. Let p be a prime. If n is any integer, show that either p|n or gcd(p,n) = 1.

27. If gcd(m, p) = 1 and p is a prime, show that gcd(m, p^k) = 1 for all k ≥ 1.

28. Show that none of n! + 2, n! + 3, ..., n! + n are primes for any n ≥ 2. Hence, show
that there are arbitrarily long gaps in the primes.

29. Let ab = a_1 b_1, where a, b, a_1, and b_1 are positive integers. If gcd(a, b_1) = 1 and
gcd(a_1, b) = 1, show that a = a_1 and b = b_1.

30. Find the prime factorizations of the following integers:

(a) 27783   (b) 1331   (c) 2431
(d) 18900   (e) 241   (f) 1457

31. Find the gcd and the lcm of the following pairs of numbers:

(a) 735, 110   (b) 101, 113   (c) 139, 278   (d) 221, 187

32. If d = gcd(a,b) and m = ab/d, show that m = lcm(a, b) using only Theorem 3.

33. Let n be a positive integer with prime factorization n = p_1^{n_1} p_2^{n_2} ··· p_r^{n_r} where the p_i
are distinct primes and n_i ≥ 1 for each i.

(a) Show that n has (n_1 + 1)(n_2 + 1) ··· (n_r + 1) distinct positive divisors.

(b) Write down all the positive divisors of 340, 108, p”, p?q, where p and q are distinct 
primes. 


(c) How many positive divisors does $n$ have if $n = 25200$; $n = 41472$?

34. If $m \geq 1$ and $n \geq 1$ are relatively prime integers and $mn$ is the square of an integer,
show that both $m$ and $n$ are squares. Is this result true if $m$ and $n$ are not relatively
prime?

35. If $\gcd(m, n) = 1$, where $m \geq 1$ and $n \geq 1$, and if $d \mid mn$, show that $d = m_1 n_1$ for
some $m_1 \mid m$ and $n_1 \mid n$. [Hint: Theorem 7.]

36. Do Exercise 35 without assuming that $\gcd(m, n) = 1$. [Hint: If $0 \leq e \leq f + g$,
where $f \geq 0$ and $g \geq 0$ are integers, show that $e$ can be written $e = f_1 + g_1$, where
$0 \leq f_1 \leq f$ and $0 \leq g_1 \leq g$. Use Theorem 8.]

37. Let $a \geq 1$ and $b \geq 1$ be integers. Show that there exist integers $u \geq 1$ and $v \geq 1$
such that $u \mid a$, $v \mid b$, $\gcd(u, v) = 1$, and lcm(u, v) = ab. [Hint: Theorem 9.]

38. If $q$ is a rational number such that $q^2$ is an integer, show that $q$ is an integer. [Hint:
If $m^2 \mid n^2$, show that $m \mid n$ using Theorem 7.]

39. (a) Show that every prime $p > 2$ has the form $p = 4k + 1$ or $p = 4k + 3$.

(b) Modify the proof of Theorem 10 to show that there are infinitely many primes
of the form $4k + 3$.

40. A school has $n$ lockers in a row along one side of a hall. The $n$ students run down
the hall one after the other. The first student closes all the lockers; then the second
opens doors 2, 4, 6, ...; the third changes doors 3, 6, 9, ... (that is, opens a door if it
is closed and closes it if it is open); the fourth student changes doors 4, 8, 12, ..., and
so on. When all $n$ students have gone through, which locker doors remain closed?
Prove your answer. [Hint: Exercise 33(a).]

41. Compute the following:

(a) $\gcd(28665, 22869)$ and $\operatorname{lcm}(28665, 22869)$

(b) $\gcd(231, 273, 429)$ and $\operatorname{lcm}(231, 273, 429)$

(c) $\gcd(1365, 1911, 1155, 1925)$ and $\operatorname{lcm}(1365, 1911, 1155, 1925)$

42. Show that $\gcd(a, b, c) = \gcd[a, \gcd(b, c)]$.

43. Let $d = \gcd(a_1, a_2, a_3, \ldots, a_k)$, where the $a_i$ are positive integers. Show that
integers $x_1, x_2, \ldots, x_k$ exist such that $d = x_1 a_1 + \cdots + x_k a_k$. [Hint: Let $m$ be the
smallest member of $X = \{x_1 a_1 + \cdots + x_k a_k \mid x_i \in \mathbb{Z},\ x_1 a_1 + \cdots + x_k a_k \geq 1\}$, and
show that $m = d$. See the proof of Theorem 3.]

44. Let $b \geq 2$ be a fixed integer. If $n \geq 0$ is any integer, show that $n$ can be written
in the form $n = r_t b^t + r_{t-1} b^{t-1} + \cdots + r_1 b + r_0$, where $t \geq 0$ and $0 \leq r_i < b$ for
all $i$. Show further that these integers $r_i$ and $t$ are uniquely determined by $n$. This
expression is called the base $b$ representation of $n$.

45. Let $m \geq 1$ and $n \geq 1$ be integers.

(a) If $m = qn + r$, $q, r \in \mathbb{Z}$, $0 \leq r < n$, show that $2^m - 1 = x(2^n - 1) + (2^r - 1)$ for
some $x \in \mathbb{Z}$, where $0 \leq 2^r - 1 < 2^n - 1$.

(b) If $d = \gcd(m, n)$, show that $\gcd(2^m - 1, 2^n - 1) = 2^d - 1$. [Hint: Get $d$ by the
euclidean algorithm and use (a).]


1.3 INTEGERS MODULO n


Two integers $a$ and $b$ are said to have the same parity if both are even or both are
odd, that is, if $2 \mid (a - b)$. The following definition extends this idea and introduces
an important equivalence on the set $\mathbb{Z}$ of integers. Let $n \geq 2$ be an integer.

Then integers $a$ and $b$ are said to be congruent modulo $n$ if $n \mid (a - b)$.
In this case we write $a \equiv b \pmod{n}$ and refer to $n$ as the modulus.


Thus, we have $2 \equiv 5 \pmod 3$, $21 \equiv 16 \pmod 5$, and $-4 \equiv 2 \pmod 6$. The expression
$21832 \equiv 32 \pmod{100}$ explains why we can test whether an integer is divisible by
100 by looking at the last two digits. Note that $a \equiv 0 \pmod{n}$ if and only if $n \mid a$.
We assume that $n \geq 2$ because congruence modulo 0 or 1 is of no interest (verify).
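
The definition translates directly into a divisibility test. The following is a minimal Python sketch; the function name `is_congruent` is a hypothetical helper introduced only for illustration.

```python
def is_congruent(a: int, b: int, n: int) -> bool:
    """Return True when n divides a - b, that is, a ≡ b (mod n)."""
    if n < 2:
        raise ValueError("the modulus is assumed to be at least 2")
    return (a - b) % n == 0


# The congruences quoted in the text.
print(is_congruent(2, 5, 3))         # 3 divides 2 - 5 = -3
print(is_congruent(21, 16, 5))       # 5 divides 21 - 16 = 5
print(is_congruent(-4, 2, 6))        # 6 divides -4 - 2 = -6
print(is_congruent(21832, 32, 100))  # the last two digits agree
```

In Python the expression `(a - b) % n` is never negative when `n` is positive, so the test also works when `a - b` is negative.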

As the notation $\equiv$ suggests, congruence modulo $n$ is an equivalence relation on
$\mathbb{Z}$.¹¹ The notation is justified in Theorem 1 and the proof is left as Exercise 6(a).
Theorem 1. Congruence modulo $n$ is an equivalence on $\mathbb{Z}$; that is:

(1) $a \equiv a \pmod{n}$ for every integer $a$.

(2) If $a \equiv b \pmod{n}$, then $b \equiv a \pmod{n}$.

(3) If $a \equiv b \pmod{n}$ and $b \equiv c \pmod{n}$, then $a \equiv c \pmod{n}$.

If $a$ is an integer, its equivalence class $[a]$ with respect to congruence modulo $n$
is called its residue class modulo $n$, and we write $\bar{a} = [a]$ for convenience:


$\bar{a} = [a] = \{x \in \mathbb{Z} \mid x \equiv a \pmod{n}\}$.


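The following result will be used frequently below.
Theorem 2. Given $n \geq 2$, $\bar{a} = \bar{b}$ if and only if $a \equiv b \pmod{n}$.


Proof. Suppose $\bar{a} = \bar{b}$. Since $a \in \bar{a}$, we have $a \in \bar{b}$, so $a \equiv b$. Conversely, let $a \equiv b$.
Since $\bar{a}$ and $\bar{b}$ are sets, we must show that $\bar{a} \subseteq \bar{b}$ and $\bar{b} \subseteq \bar{a}$. If $x \in \bar{a}$, then $x \equiv a$; so,
as $a \equiv b$, we have $x \equiv b$ by (3) of Theorem 1. This proves that $\bar{a} \subseteq \bar{b}$. Since $b \equiv a$
by (2) of Theorem 1, a similar proof shows that $\bar{b} \subseteq \bar{a}$. ■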


Residue classes are easy to describe. For example, if $n = 2$,


$\bar{0} = \{x \in \mathbb{Z} \mid x \equiv 0 \pmod 2\}$ = the set of even integers
$\bar{1} = \{x \in \mathbb{Z} \mid x \equiv 1 \pmod 2\}$ = the set of odd integers


In general, if $a$ is an integer, the division algorithm gives $a = qn + r$, where
$0 \leq r \leq n - 1$, so $a \equiv r \pmod{n}$. Thus every residue class modulo $n$ appears in
the list $\bar{0}, \bar{1}, \bar{2}, \ldots, \overline{n-1}$. In fact it appears exactly once.


Theorem 3. Let $n \geq 2$ be an integer.
(1) If $a \in \mathbb{Z}$, then $\bar{a} = \bar{r}$ for some $r$ where $0 \leq r \leq n - 1$.
(2) The residue classes $\bar{0}, \bar{1}, \bar{2}, \ldots, \overline{n-1}$ modulo $n$ are distinct.


Proof. It remains to verify (2). Suppose $\bar{r} = \bar{s}$, where $0 \leq r \leq n - 1$ and $0 \leq s \leq n - 1$.
We may assume that $r \leq s$. Then $\bar{r} = \bar{s}$ means that $r \equiv s \pmod{n}$, so $s - r$ is an
integral multiple of $n$ such that $0 \leq s - r \leq n - 1$. This implies that $r = s$. ■


The set of all residue classes modulo $n$ is denoted


$\mathbb{Z}_n = \{\bar{0}, \bar{1}, \bar{2}, \ldots, \overline{n-1}\}$


and is called the set of integers modulo $n$. Thus, (2) of Theorem 3 is the assertion
that $|\mathbb{Z}_n| = n$. In particular, $\mathbb{Z}_2 = \{\bar{0}, \bar{1}\}$, $\mathbb{Z}_3 = \{\bar{0}, \bar{1}, \bar{2}\}$, and so on.¹²


¹¹ See Section 0.4 for a discussion on equivalence relations.


Solution. It seems that $\overline{48}$ does not appear. However, $48 \equiv 6 \pmod 7$ means that
$\overline{48} = \bar{6}$ does indeed occur. Similarly, $-16 \equiv 5 \pmod 7$, so $\overline{-16} = \bar{5}$ also appears. □


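Example 2. If $a$ is an odd integer, show that $\bar{a} = \bar{1}$ or $\bar{a} = \bar{3}$ in $\mathbb{Z}_4 = \{\bar{0}, \bar{1}, \bar{2}, \bar{3}\}$.


Solution. We know that $\bar{a}$ is one of $\bar{0}$, $\bar{1}$, $\bar{2}$, or $\bar{3}$ in $\mathbb{Z}_4$. If $\bar{a} = \bar{2}$, then $a \equiv 2 \pmod 4$,
so $a - 2 = 4q$ for some integer $q$. This means that $a$ is even, contrary to assumption.
So $\bar{a} \neq \bar{2}$ and, similarly, $\bar{a} \neq \bar{0}$. The only other possibilities are $\bar{a} = \bar{1}$ and $\bar{a} = \bar{3}$. □


Example 3. In $\mathbb{Z}_n$, show that $\bar{a} = \bar{0}$ if and only if $n \mid a$.
Solution. By Theorem 2, $\bar{a} = \bar{0}$ means that $a \equiv 0 \pmod{n}$, that is, $n \mid a$. □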


Congruence modulo $n$ is compatible with addition and multiplication of integers
in the following sense. Let $a$, $a_1$, $b$, and $b_1$ denote integers.


If $a \equiv a_1 \pmod{n}$ and $b \equiv b_1 \pmod{n}$, then $a + b \equiv a_1 + b_1 \pmod{n}$ and $ab \equiv a_1 b_1 \pmod{n}$. (*)


In fact, let $a - a_1 = pn$ and $b - b_1 = qn$, where $p$ and $q$ are integers. Adding these
equations gives $(a + b) - (a_1 + b_1) = (p + q)n$, and this implies that $a + b \equiv a_1 + b_1
\pmod{n}$. Similarly, multiplying the equations $a = a_1 + pn$ and $b = b_1 + qn$ gives
$ab = a_1 b_1 + (a_1 q + b_1 p + pqn)n$, so $ab \equiv a_1 b_1 \pmod{n}$.

Condition (*) means that the arithmetic of $\mathbb{Z}$ extends naturally to $\mathbb{Z}_n$ as follows:
We define addition and multiplication of residue classes $\bar{a}$ and $\bar{b}$ in $\mathbb{Z}_n$ by


$\bar{a} + \bar{b} = \overline{a + b}$ and $\bar{a}\,\bar{b} = \overline{ab}$. (**)


Of course, we must verify that these operations are well defined, that is, we must
check that they do not depend on which generators are used for the residue classes
$\bar{a}$ and $\bar{b}$. More precisely, suppose that


$\bar{a} = \bar{a}_1$ and $\bar{b} = \bar{b}_1$,
where $a \neq a_1$ and $b \neq b_1$ are possible. If we add these classes as $\bar{a}$ and $\bar{b}$, (**) gives
their sum as $\overline{a + b}$, but if we represent the classes as $\bar{a}_1$ and $\bar{b}_1$, their sum is $\overline{a_1 + b_1}$.
Clearly, the definition of addition makes no sense unless $\overline{a + b} = \overline{a_1 + b_1}$. But $a \equiv a_1$
and $b \equiv b_1$ by Theorem 2, so $a + b \equiv a_1 + b_1$ by (*), so $\overline{a + b} = \overline{a_1 + b_1}$, as
required. Similarly, (*) shows that $\overline{ab} = \overline{a_1 b_1}$, so the definition of multiplication
also makes sense. In other words, addition and multiplication of residue classes are
well defined by (**).


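Example 4. In $\mathbb{Z}_6$ compute $\bar{3} + \bar{5}$ and $\bar{3} \cdot \bar{5}$.


Solution. The definition gives $\bar{3} + \bar{5} = \bar{8} = \bar{2}$, because $8 \equiv 2 \pmod 6$. Similarly,
$\bar{3} \cdot \bar{5} = \overline{15} = \bar{3}$. □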
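
The claim that the operations are well defined can be checked numerically. The sketch below represents each residue class by its remainder in the range 0 to n - 1; the helper names `residue`, `add_classes`, and `mul_classes` are assumptions made for this illustration.

```python
def residue(a: int, n: int) -> int:
    """Representative r of the class of a with 0 <= r <= n - 1."""
    return a % n


def add_classes(a: int, b: int, n: int) -> int:
    """Sum of the classes of a and b in Z_n, computed from any representatives."""
    return residue(a + b, n)


def mul_classes(a: int, b: int, n: int) -> int:
    """Product of the classes of a and b in Z_n, computed from any representatives."""
    return residue(a * b, n)


n = 6
# 3 and 15 name the same class in Z_6, and so do 5 and -1.
assert residue(3, n) == residue(15, n)
assert residue(5, n) == residue(-1, n)

# Changing representatives does not change the sum or the product.
print(add_classes(3, 5, n), add_classes(15, -1, n))
print(mul_classes(3, 5, n), mul_classes(15, -1, n))
```

Both lines print the same value twice, matching the results of Example 4.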


¹² Note that $\bar{a}$ means different things in $\mathbb{Z}_2$, $\mathbb{Z}_3$, ..., so to avoid ambiguity, perhaps we should
denote residue classes $\bar{a}$ in such a way that the modulus is apparent. However,
this is rarely done in practice as the modulus is usually clear from the context.


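Theorem 4 collects several properties of these operations in $\mathbb{Z}_n$, each of which
is the analogue of the corresponding property for $\mathbb{Z}$.


Theorem 4. Let $n \geq 2$ be a fixed modulus and let $a$, $b$, and $c$ denote arbitrary
integers. Then the following hold in $\mathbb{Z}_n$.

(1) $\bar{a} + \bar{b} = \bar{b} + \bar{a}$ and $\bar{a}\,\bar{b} = \bar{b}\,\bar{a}$.

(2) $\bar{a} + (\bar{b} + \bar{c}) = (\bar{a} + \bar{b}) + \bar{c}$ and $\bar{a}(\bar{b}\,\bar{c}) = (\bar{a}\,\bar{b})\bar{c}$.

(3) $\bar{a} + \bar{0} = \bar{a}$ and $\bar{a}\,\bar{1} = \bar{a}$.

(4) $\bar{a} + \overline{-a} = \bar{0}$.

(5) $\bar{a}(\bar{b} + \bar{c}) = \bar{a}\,\bar{b} + \bar{a}\,\bar{c}$.


Proof. We prove (5) and leave the rest as Exercise 6(b). Thus,


$\bar{a}(\bar{b} + \bar{c}) = \bar{a}\,\overline{(b + c)}$ (definition of addition in $\mathbb{Z}_n$)
$= \overline{a(b + c)}$ (definition of multiplication in $\mathbb{Z}_n$)
$= \overline{ab + ac}$ (property of $\mathbb{Z}$)
$= \overline{ab} + \overline{ac}$ (definition of addition in $\mathbb{Z}_n$)
$= \bar{a}\,\bar{b} + \bar{a}\,\bar{c}$ (definition of multiplication in $\mathbb{Z}_n$),
which proves (5). ■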


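These properties enable us to do arithmetic in $\mathbb{Z}_n$ in much the same way as in
$\mathbb{Z}$. In particular, (3) shows that $\bar{0}$ and $\bar{1}$ play roles in $\mathbb{Z}_n$ analogous to those of 0
and 1 in $\mathbb{Z}$. For this reason, $\bar{0}$ and $\bar{1}$ are called the zero of $\mathbb{Z}_n$ and the unity of $\mathbb{Z}_n$,
respectively. Similarly, because of (4), $\overline{-a}$ is called the negative of $\bar{a}$ in $\mathbb{Z}_n$, and is
denoted $-\bar{a} = \overline{-a}$. Then subtraction in $\mathbb{Z}_n$ is defined by


$\bar{a} - \bar{b} = \bar{a} + (-\bar{b}) = \overline{a - b}$,


an operation used much as it is in $\mathbb{Z}$.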


 + | 0 1 2 3 4 5        × | 0 1 2 3 4 5
---+-------------      ---+-------------
 0 | 0 1 2 3 4 5        0 | 0 0 0 0 0 0
 1 | 1 2 3 4 5 0        1 | 0 1 2 3 4 5
 2 | 2 3 4 5 0 1        2 | 0 2 4 0 2 4
 3 | 3 4 5 0 1 2        3 | 0 3 0 3 0 3
 4 | 4 5 0 1 2 3        4 | 0 4 2 0 4 2
 5 | 5 0 1 2 3 4        5 | 0 5 4 3 2 1


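These tables reveal many differences between the arithmetic of $\mathbb{Z}_6$ and that
of $\mathbb{Z}$. For example, while 0 and 1 are the only integers $k$ in $\mathbb{Z}$ with the property
that $k^2 = k$, each of $\bar{0}$, $\bar{1}$, $\bar{3}$, and $\bar{4}$ enjoys this property in $\mathbb{Z}_6$. Another difference is
that if $ab = ac$ in $\mathbb{Z}$ and $a \neq 0$, then $b = c$. But $\bar{4} \cdot \bar{2} = \bar{4} \cdot \bar{5}$ in $\mathbb{Z}_6$, and $\bar{4} \neq \bar{0}$, but
$\bar{2} \neq \bar{5}$. Hence, we must be careful about "cancellation" in $\mathbb{Z}_n$. In fact, this concern
is related to another difference between $\mathbb{Z}$ and $\mathbb{Z}_n$. If $ab = 0$ in $\mathbb{Z}$, then $a = 0$ or
$b = 0$. However, this need not hold in $\mathbb{Z}_n$. For example, $\bar{2} \cdot \bar{3} = \bar{0}$ in $\mathbb{Z}_6$, but $\bar{2} \neq \bar{0}$
and $\bar{3} \neq \bar{0}$.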
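
The observations about $\mathbb{Z}_6$ can be reproduced mechanically. The following Python sketch builds the two tables and then searches them; the variable names are chosen for this illustration only, and classes are represented by the remainders 0 to 5.

```python
n = 6
elements = range(n)

# Row a, column b holds the representative of a + b (respectively a * b) in Z_6.
add_table = [[(a + b) % n for b in elements] for a in elements]
mul_table = [[(a * b) % n for b in elements] for a in elements]

# Classes k with k * k = k.
idempotents = [k for k in elements if mul_table[k][k] == k]

# Pairs of nonzero classes (a, b), a <= b, whose product is zero.
zero_divisor_pairs = [
    (a, b) for a in range(1, n) for b in range(a, n) if mul_table[a][b] == 0
]

print(idempotents)         # [0, 1, 3, 4]
print(zero_divisor_pairs)  # [(2, 3), (3, 4)]
print(mul_table[4][2] == mul_table[4][5])  # cancellation fails: True
```

Changing `n` to a prime such as 5 or 7 leaves only 0 and 1 as idempotents and produces no zero-divisor pairs.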


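In Examples 5-7, we use the arithmetic of $\mathbb{Z}_n$ to deduce facts about $\mathbb{Z}$. The
connection is the fact (in Theorem 2) that $\bar{a} = \bar{b}$ in $\mathbb{Z}_n$ means that $a \equiv b \pmod{n}$.
Example 5. Show that $a^5 \equiv a \pmod 5$ holds for all integers $a$.


Solution. For an integer $a$, it suffices by Theorem 2 to show that $\bar{a}^5 = \bar{a}$ in $\mathbb{Z}_5$.
Because $\bar{a}$ equals $\bar{0}$, $\bar{1}$, $\bar{2}$, $\bar{3}$, or $\bar{4}$, we examine each case separately.
• If $\bar{a} = \bar{0}$, then $\bar{a}^5 = \bar{0}^5 = \bar{0} = \bar{a}$.
• If $\bar{a} = \bar{1}$, then $\bar{a}^5 = \bar{1}^5 = \bar{1} = \bar{a}$.
• If $\bar{a} = \bar{2}$, then $\bar{a}^5 = \bar{2}^5 = \bar{2}^3 \cdot \bar{2}^2 = \bar{3} \cdot \bar{4} = \bar{2} = \bar{a}$.
• If $\bar{a} = \bar{3}$, then $\bar{a}^5 = \bar{3}^5 = \bar{9} \cdot \overline{27} = \bar{4} \cdot \bar{2} = \bar{3} = \bar{a}$.
• If $\bar{a} = \bar{4}$, then $\bar{a}^5 = \bar{4}^5 = \overline{16} \cdot \overline{64} = \bar{1} \cdot \bar{4} = \bar{4} = \bar{a}$.


Hence, $\bar{a}^5 = \bar{a}$ in every case, so $a^5 \equiv a \pmod 5$ for all integers $a$. □


Example 5 is a special case of Fermat's theorem, which, for any prime $p$, asserts
that $a^p \equiv a \pmod{p}$ for all integers $a$. We return to it later (Theorem 8).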


Example 6. What is the remainder when $4^{119}$ is divided by 7?


Solution. If we can show that $4^{119} \equiv r \pmod 7$, where $0 \leq r \leq 6$, then $r$ is the
desired remainder. We do the computation in $\mathbb{Z}_7$. Note that, as $\bar{4}^2 = \bar{2}$ in $\mathbb{Z}_7$,
we have $\bar{4}^3 = \bar{8} = \bar{1}$. With this in mind, divide the exponent 119 by 3 to get
$119 = 3 \cdot 39 + 2$. Then,

$\bar{4}^{119} = \bar{4}^{3 \cdot 39 + 2} = (\bar{4}^3)^{39} \cdot \bar{4}^2 = \bar{1}^{39} \cdot \bar{2} = \bar{2}$.


Hence, $4^{119} \equiv 2 \pmod 7$, so the required remainder is 2. □
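
The computation in Example 6 can be cross-checked by machine. The sketch below uses repeated squaring, which is a different technique from the shortcut in the solution (it does not look for a power equal to the unity) but likewise reduces modulo $n$ at every step; the name `power_mod` is a hypothetical helper, and Python's built-in three-argument `pow` does the same job.

```python
def power_mod(base: int, exponent: int, n: int) -> int:
    """Compute base**exponent modulo n by repeated squaring."""
    result = 1
    base %= n
    while exponent > 0:
        if exponent & 1:                  # current binary digit of the exponent is 1
            result = (result * base) % n
        base = (base * base) % n          # square for the next binary digit
        exponent >>= 1
    return result


print(power_mod(4, 119, 7))  # 2, the remainder found in Example 6
print(pow(4, 119, 7))        # the built-in agrees
```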


If a is an integer in decimal notation, it is common knowledge that a is divisible 
by 2 or 5 if and only if the same is true of its unit digit. Example 7 gives a similar 
test for divisibility by 9. 


Example 7. Casting Out Nines. Show that a positive integer is divisible by 9 if 
and only if the sum of its digits is divisible by 9. 


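Solution. If $a = d_n d_{n-1} \cdots d_1 d_0$ in decimal notation, where $d_0, d_1, \ldots, d_n$ are the
digits, then $a = d_0 + 10 d_1 + 10^2 d_2 + \cdots + 10^n d_n$. Now $\overline{10} = \bar{1}$ in $\mathbb{Z}_9$, so $\overline{10}^k = \bar{1}^k = \bar{1}$
for each $k$. Hence, in $\mathbb{Z}_9$,


$\bar{a} = \bar{d}_0 + \bar{1} \cdot \bar{d}_1 + \bar{1}^2 \cdot \bar{d}_2 + \cdots + \bar{1}^n \cdot \bar{d}_n = \overline{d_0 + d_1 + \cdots + d_n}$.
Thus, $a \equiv d_0 + d_1 + \cdots + d_n \pmod 9$, and the result follows from Example 3. □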
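
The test of Example 7 is easy to try out. The following sketch sums the decimal digits repeatedly until a single digit remains; the function names `digit_sum` and `divisible_by_9` are assumptions introduced for this illustration.

```python
def digit_sum(a: int) -> int:
    """Sum of the decimal digits of a nonnegative integer."""
    return sum(int(ch) for ch in str(a))


def divisible_by_9(a: int) -> bool:
    """Casting out nines: replace a by its digit sum until one digit is left."""
    s = a
    while s >= 10:
        s = digit_sum(s)
    return s in (0, 9)


print(digit_sum(21832), 21832 % 9)  # the digit sum and the number agree modulo 9
print(divisible_by_9(123456789))    # digit sum is 45, then 9
```

Each pass preserves the residue class modulo 9, which is exactly the congruence $a \equiv d_0 + d_1 + \cdots + d_n \pmod 9$ established in the solution.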


These three examples show that the properties in Theorem 4 allow many of the
operations of ordinary arithmetic to be carried out in $\mathbb{Z}_n$. However, these properties
tell us nothing about how to solve an equation such as $\bar{a}x = \bar{b}$ in $\mathbb{Z}_n$. For example,
consider

$\bar{5}x = \bar{2}$
in $\mathbb{Z}_{17}$. The desired solution (if there is one) is a residue class $x$ in $\mathbb{Z}_{17}$, so $x$ is one
of $\bar{0}, \bar{1}, \bar{2}, \ldots, \overline{16}$. Hence, one method is simply to try all these classes! If we do so,
we find that $x = \overline{14}$ is the only solution. However, this method is impractical if the
modulus is large.

A better approach is as follows. Suppose that a residue class $\bar{b}$ can be found
such that $\bar{b} \cdot \bar{5} = \bar{1}$. Then if we multiply both sides of the equation $\bar{5}x = \bar{2}$ by $\bar{b}$, the
result is $\bar{b} \cdot \bar{5}x = \bar{b} \cdot \bar{2}$, that is, $x = \overline{2b}$. The class $\bar{b}$ (if it exists) can again be found
by trial and error. In fact $\bar{b} = \bar{7}$ works, so $x = \overline{2 \cdot 7} = \overline{14}$, as before.

Fortunately, there is a systematic way of finding $\bar{b}$ in $\mathbb{Z}_{17}$ such that $\bar{b} \cdot \bar{5} = \bar{1}$.
Note that 5 and 17 are relatively prime, so the euclidean algorithm can be used to
express $\gcd(5, 17) = 1$ as a linear combination of 5 and 17. In fact, we have


$17 = 3 \cdot 5 + 2$ and then $5 = 2 \cdot 2 + 1$;


so, eliminating remainders, $1 = 5 - 2(17 - 3 \cdot 5) = 7 \cdot 5 - 2 \cdot 17$. This implies that
$7 \cdot 5 \equiv 1 \pmod{17}$, and so $\bar{7} \cdot \bar{5} = \bar{1}$ in $\mathbb{Z}_{17}$. This gives $\bar{b} = \bar{7}$.

This method clearly generalizes. For a modulus $n \geq 2$ and an integer $a$, a residue
class $\bar{b}$ in $\mathbb{Z}_n$ is called an inverse of $\bar{a}$ if $\bar{b}\,\bar{a} = \bar{1}$ in $\mathbb{Z}_n$. If $\bar{a}$ has an inverse, that inverse
is unique (Exercise 23) and we say $\bar{a}$ is invertible. Theorem 5 characterizes when
an inverse exists, and the proof shows that (as above) the euclidean algorithm can
be used to find it.


Theorem 5. Let $a$ and $n$ be integers with $n \geq 2$. Then $\bar{a}$ has an inverse in $\mathbb{Z}_n$ if
and only if $a$ and $n$ are relatively prime.


Proof. If $a$ and $n$ are relatively prime, then $1 = \gcd(a, n)$ is a linear combination of
$a$ and $n$ (by Theorem 4 §1.2), say $1 = ba + cn$, where $b$ and $c$ are integers. Hence,
$ba \equiv 1 \pmod{n}$, so $\bar{b}\,\bar{a} = \bar{1}$ by Theorem 2. Conversely, if $\bar{b}$ exists such that $\bar{b}\,\bar{a} = \bar{1}$,
then $ba \equiv 1 \pmod{n}$. Thus, $n \mid (1 - ba)$, say $1 - ba = qn$ for some integer $q$. But then
$1 = ba + qn$, so $a$ and $n$ are relatively prime (again by Theorem 4 §1.2). ■


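Example 8. Find the inverse of $\overline{16}$ in $\mathbb{Z}_{35}$ and use it to solve $\overline{16}x = \bar{9}$ in $\mathbb{Z}_{35}$.
Solution. The inverse exists as $\gcd(35, 16) = 1$. The euclidean algorithm gives
$35 = 2 \cdot 16 + 3$ and then $16 = 5 \cdot 3 + 1$,


so $1 = 16 - 5(35 - 2 \cdot 16) = 11 \cdot 16 - 5 \cdot 35$. Thus, $11 \cdot 16 \equiv 1 \pmod{35}$, and so $\overline{11}$ is the
inverse of $\overline{16}$ in $\mathbb{Z}_{35}$. Now multiply the equation $\overline{16}x = \bar{9}$ by $\overline{11}$ to obtain
$\overline{11} \cdot \overline{16}x = \overline{11} \cdot \bar{9}$; that is, $x = \overline{99} = \overline{29}$. □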
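
The procedure used in the proof of Theorem 5 and in Example 8 can be sketched in Python as follows. The function names `extended_gcd`, `inverse_mod`, and `solve_linear` are hypothetical; the loop is the usual iterative form of the euclidean algorithm that carries the coefficients of the linear combination along with the remainders.

```python
def extended_gcd(a: int, b: int):
    """Return (g, p, q) with g = gcd(a, b) = a*p + b*q, for a, b >= 0."""
    old_r, r = a, b
    old_p, p = 1, 0
    old_q, q = 0, 1
    while r != 0:
        k = old_r // r
        old_r, r = r, old_r - k * r
        old_p, p = p, old_p - k * p
        old_q, q = q, old_q - k * q
    return old_r, old_p, old_q


def inverse_mod(a: int, n: int) -> int:
    """Representative in 0..n-1 of the inverse of the class of a in Z_n."""
    g, b, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError("a and n are not relatively prime, so no inverse exists")
    return b % n


def solve_linear(a: int, c: int, n: int) -> int:
    """Solve a*x = c in Z_n when the class of a is invertible."""
    return (inverse_mod(a, n) * c) % n


print(inverse_mod(5, 17), solve_linear(5, 2, 17))    # 7 and 14
print(inverse_mod(16, 35), solve_linear(16, 9, 35))  # 11 and 29
```

When `a` and `n` share a common factor the function raises an error, in line with Theorem 5.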


Example 9. Find the elements in $\mathbb{Z}_9$ that have inverses.


Solution. The members of $\mathbb{Z}_9$ are of the form $\bar{r}$, where $r = 0, 1, 2, \ldots, 8$. Since $9 = 3^2$,
$r$ is relatively prime to 9 if and only if $r$ is not a multiple of 3. Hence, $\bar{1}$, $\bar{2}$, $\bar{4}$, $\bar{5}$, $\bar{7}$,
and $\bar{8}$ will all have inverses. Indeed, $\bar{1}$ and $\bar{8}$ are both self-inverse, whereas $\bar{2}$ and $\bar{5}$
are inverses of each other as are $\bar{4}$ and $\bar{7}$. □


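Example 10. Solve the system

$\bar{5}x + \bar{8}y = \bar{2}$
$\bar{3}x + \bar{2}y = \bar{1}$

of equations in $\mathbb{Z}_{11}$.


Solution. The usual techniques apply. Since $\bar{4} \cdot \bar{3} = \bar{1}$, we eliminate $y$ by first
multiplying the second equation by $\bar{4}$ to get $x + \bar{8}y = \bar{4}$. Subtract this from the first
equation to get $\bar{4}x = \overline{-2} = \bar{9}$. Now $\bar{3}$ is the inverse of $\bar{4}$ in $\mathbb{Z}_{11}$, so multiplication by
$\bar{3}$ gives $x = \bar{3} \cdot \bar{9} = \bar{5}$. Then the last equation gives $\bar{2}y = \bar{1} - \bar{3}x = \bar{8}$. Finally, $\bar{6}$ is the
inverse of $\bar{2}$, so $y = \bar{6} \cdot \bar{8} = \bar{4}$. □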


If $a$ is a real number, an expression $x^2 + ax$ becomes a square if $(\frac{1}{2}a)^2$ is added:
$x^2 + ax + (\frac{1}{2}a)^2 = (x + \frac{1}{2}a)^2$. This process is called completing the square, and
it works in $\mathbb{Z}_n$ provided $\bar{2}$ has an inverse in $\mathbb{Z}_n$ (that is, if $n$ is odd).


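Example 11. Solve the quadratic $x^2 + \bar{3}x + \bar{9} = \bar{0}$ in $\mathbb{Z}_{13}$.


Solution. First subtract $\bar{9}$ from both sides to obtain $x^2 + \bar{3}x = \overline{-9} = \bar{4}$. The inverse
of $\bar{2}$ in $\mathbb{Z}_{13}$ is $\bar{7}$, so we complete the square on the left by adding $(\bar{7} \cdot \bar{3})^2 = \bar{8}^2 = \overline{12}$ to
both sides. The result is $x^2 + \bar{3}x + \overline{12} = \bar{4} + \overline{12}$, that is, $(x + \bar{8})^2 = \bar{3}$. Now $\mathbb{Z}_{13}$ has
13 elements and, by inspection, only 2 of them square to $\bar{3}$, namely, $\bar{4}$ and $-\bar{4} = \bar{9}$.
Hence, $x + \bar{8} = \bar{4}$ or $x + \bar{8} = \bar{9}$, and so $x = \bar{9}$ and $x = \bar{1}$ are the solutions. □


Note that there are two solutions in Example 11. The reason is that $\bar{3}$ has two
"square roots" in $\mathbb{Z}_{13}$: $\bar{4}$ and $-\bar{4} = \bar{9}$. However, other situations are possible: In $\mathbb{Z}_7$,
$\bar{3}$ has no square root, whereas in $\mathbb{Z}_{27}$, $\bar{9}$ has six square roots, $\bar{3}$ and $-\bar{3} = \overline{24}$, $\bar{6}$ and
$-\bar{6} = \overline{21}$, and finally $\overline{12}$ and $-\overline{12} = \overline{15}$.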
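
For small moduli the "by inspection" step can be carried out by exhaustive search. The sketch below is only practical when $n$ is small; `square_roots` and `solve_quadratic` are hypothetical helper names, and classes are represented by the remainders 0 to n - 1.

```python
def square_roots(c: int, n: int) -> list:
    """All x in 0..n-1 with x*x = c in Z_n, found by trying every class."""
    return [x for x in range(n) if (x * x - c) % n == 0]


def solve_quadratic(a: int, b: int, n: int) -> list:
    """All x in 0..n-1 with x*x + a*x + b = 0 in Z_n, found by trying every class."""
    return [x for x in range(n) if (x * x + a * x + b) % n == 0]


print(square_roots(3, 13))        # [4, 9]
print(square_roots(3, 7))         # []
print(square_roots(9, 27))        # [3, 6, 12, 15, 21, 24]
print(solve_quadratic(3, 9, 13))  # [1, 9], the solutions of Example 11
```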

The following fact about congruences is useful in number theory and computer
science, and was known to the Chinese in the fourth century.


Theorem 6. Chinese Remainder Theorem. Let $m$ and $n$ be relatively prime
integers. If $s$ and $t$ are arbitrary integers, there exists a solution $x \in \mathbb{Z}$ to the
simultaneous congruences


$x \equiv s \pmod{m}$ and $x \equiv t \pmod{n}$.


Proof. Since $\gcd(m, n) = 1$, the euclidean algorithm gives $p$ and $q$ in $\mathbb{Z}$ such that
$1 = mp + nq$. Take

$x = (mp)t + (nq)s$.
Then $x - s = mpt + (nq - 1)s = mpt - mps = mp(t - s)$, so $x \equiv s \pmod{m}$. A similar argument
gives $x \equiv t \pmod{n}$. ■


The nice thing about Theorem 6 is that the proof gives an algorithm for finding
the solution $x$: The euclidean algorithm gives $p$ and $q$ such that $1 = mp + nq$, and
the solution is $x = mpt + nqs$. Furthermore, this method can be iterated to solve a
system of more than two congruences, provided only that the moduli are relatively
prime in pairs. To illustrate, let $m_1$, $m_2$, and $m_3$ be integers relatively prime in
pairs. Given arbitrary integers $s_1$, $s_2$, and $s_3$, we want to find an integer $x$ such that


$x \equiv s_i \pmod{m_i}$ for each $i = 1, 2, 3$.


The Chinese remainder theorem yields $a$ such that $a \equiv s_i \pmod{m_i}$ for $i = 1, 2$.
Since $m_1 m_2$ and $m_3$ are relatively prime, apply the Chinese remainder theorem
again to obtain $x$ such that


$x \equiv a \pmod{m_1 m_2}$ and $x \equiv s_3 \pmod{m_3}$.


But then $x \equiv a \pmod{m_1}$, so since $a \equiv s_1 \pmod{m_1}$, we have $x \equiv s_1 \pmod{m_1}$.
Similarly, $x \equiv s_2 \pmod{m_2}$.

In general, if $m_1, m_2, \ldots, m_k$ are relatively prime in pairs, and if $s_1, s_2, \ldots, s_k$
are arbitrary integers, then there exists $x \in \mathbb{Z}$ such that


$x \equiv s_i \pmod{m_i}$ for each $i = 1, 2, \ldots, k$.

These general systems of congruences are important in computer science because
they provide a method for doing arithmetic with integers that exceed the word size
of the computer (the largest integer that can be used in machine arithmetic).
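
The algorithm contained in the proof of Theorem 6, together with the iteration just described, can be sketched as follows. This is an illustration under stated assumptions: the names `crt_pair` and `crt` are hypothetical, and the coefficient $p$ is obtained from Python's built-in `pow(m, -1, n)` (available in Python 3.8 and later) rather than from an explicit run of the euclidean algorithm.

```python
def crt_pair(s: int, m: int, t: int, n: int) -> int:
    """Return x in 0..m*n-1 with x ≡ s (mod m) and x ≡ t (mod n), gcd(m, n) = 1."""
    p = pow(m, -1, n)      # m*p ≡ 1 (mod n); raises ValueError if gcd(m, n) != 1
    q = (1 - m * p) // n   # exact division, so that 1 = m*p + n*q
    return (m * p * t + n * q * s) % (m * n)


def crt(residues, moduli) -> int:
    """Solve x ≡ residues[i] (mod moduli[i]) for moduli relatively prime in pairs."""
    x, modulus = residues[0] % moduli[0], moduli[0]
    for s, m in zip(residues[1:], moduli[1:]):
        x = crt_pair(x, modulus, s, m)  # combine the solution so far with the next congruence
        modulus *= m
    return x


x = crt([2, 3, 2], [3, 5, 7])
print(x, x % 3, x % 5, x % 7)  # 23 2 3 2
```

Reducing modulo the product of the moduli at each stage keeps the intermediate numbers small; any integer congruent to the result modulo that product is also a solution.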


The only elements of $\mathbb{Z}$ that have an inverse in $\mathbb{Z}$ are 1 and $-1$ (because $\frac{1}{k}$
does not lie in $\mathbb{Z}$ if $k \neq 1, -1$). Thus, $\mathbb{Z}$ resembles $\mathbb{Z}_6$ in this respect (see the table
following Theorem 4). At the other extreme, every real number $x \neq 0$ has
an inverse $\frac{1}{x}$ in $\mathbb{R}$. Theorem 7 characterizes when this happens in $\mathbb{Z}_n$.


Theorem 7. The following are equivalent for an integer $n \geq 2$.
(1) Every element $\bar{a} \neq \bar{0}$ in $\mathbb{Z}_n$ has an inverse.
(2) If $\bar{a}\,\bar{b} = \bar{0}$ in $\mathbb{Z}_n$, then either $\bar{a} = \bar{0}$ or $\bar{b} = \bar{0}$.
(3) $n$ is a prime.


Proof. We prove that (1) ⇒ (2), (2) ⇒ (3), and (3) ⇒ (1).

(1) ⇒ (2). Assume (1) is true and let $\bar{a}\,\bar{b} = \bar{0}$ in $\mathbb{Z}_n$. If $\bar{a} = \bar{0}$, there is nothing to
prove. Otherwise, $\bar{a}$ has an inverse by (1), say $\bar{c}\,\bar{a} = \bar{1}$. Then we multiply both sides
of $\bar{a}\,\bar{b} = \bar{0}$ by $\bar{c}$ to get $\bar{c}\,\bar{a}\,\bar{b} = \bar{c}\,\bar{0}$; that is, $\bar{b} = \bar{0}$.

(2) ⇒ (3). If $n$ is not prime, let $n = ab$, where $2 \leq a < n$ and $2 \leq b < n$. But
then $\bar{a}\,\bar{b} = \bar{n} = \bar{0}$, where $\bar{a} \neq \bar{0}$ and $\bar{b} \neq \bar{0}$. This contradicts (2), so the assumption
that $n$ is not prime cannot be valid.

(3) ⇒ (1). If $n$ is prime, let $\bar{a} \neq \bar{0}$ in $\mathbb{Z}_n$. Then $\gcd(a, n) = 1$ (because otherwise
$\gcd(a, n) = n$, so $n \mid a$, contradicting $\bar{a} \neq \bar{0}$). But then $1 = ba + cn$ for integers $b$ and $c$
(by Theorem 4 §1.2), so $ba \equiv 1 \pmod{n}$. Thus, $\bar{b}\,\bar{a} = \bar{1}$ in $\mathbb{Z}_n$, proving (1). ■


Hence, if $p$ is a prime, $\mathbb{Z}_p$ has the property that every nonzero element has an
inverse. This is also true of the real numbers $\mathbb{R}$, and such systems are called fields.
The following consequence of Theorem 7 will be referred to later.


Corollary. Wilson's Theorem. If $p$ is a prime, then $(p - 1)! \equiv -1 \pmod{p}$.


Proof. We write $\bar{a} = a$ in $\mathbb{Z}_p$ for convenience. Since $p$ is prime, each element
$1, 2, 3, \ldots, p - 1$ in $\mathbb{Z}_p$ has an inverse by Theorem 7. Hence, pairs of inverses in the
product $(p - 1)! = 1 \cdot 2 \cdot 3 \cdots (p - 1)$ will cancel, leaving only the self-inverse elements
1 and $-1$ (Exercise 26). Thus, $(p - 1)! = 1(-1) = -1$ in $\mathbb{Z}_p$, as required. ■
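
Wilson's theorem is easy to test for small moduli. In the sketch below the helper name `wilson_holds` is an assumption made for illustration; note that $-1$ is represented by $n - 1$ when remainders are taken in the range 0 to n - 1.

```python
from math import factorial


def wilson_holds(n: int) -> bool:
    """True when (n - 1)! ≡ -1 (mod n)."""
    return factorial(n - 1) % n == n - 1


print([n for n in range(2, 30) if wilson_holds(n)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The list contains every prime below 30, as the Corollary predicts. That it contains no composite numbers reflects the converse of Wilson's theorem, a known result that is not proved in this section.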


Example 12. Write down the multiplication table of $\mathbb{Z}_5$ and illustrate Theorem 7.


Solution. The first row and column of the table consist entirely of zeros (true for
any modulus), but the fact that no other entry equals 0 verifies (2) of Theorem 7.
Similarly, the fact that every row (or column) except the first contains 1 verifies
(1) of Theorem 7.


 × | 0 1 2 3 4
---+-----------
 0 | 0 0 0 0 0
 1 | 0 1 2 3 4
 2 | 0 2 4 1 3
 3 | 0 3 1 4 2
 4 | 0 4 3 2 1


The simplest situation in which Theorem 7 applies is when $n = 2$. In this case,
$\mathbb{Z}_2 = \{0, 1\}$ and the addition and multiplication tables are as follows:


 + | 0 1        × | 0 1
---+-----      ---+-----
 0 | 0 1        0 | 0 0
 1 | 1 0        1 | 0 1


This is binary arithmetic, which is important in the design of computers.


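We conclude with a famous theorem of Pierre de Fermat. In Example 5, we
showed that $a^5 \equiv a \pmod 5$ holds for all integers $a$. In fact, it holds if we replace 5
by any prime.


Theorem 8. Fermat's Theorem. If $p$ is a prime, then
$a^p \equiv a \pmod{p}$ for all integers $a$.
In fact, $a^{p-1} \equiv 1 \pmod{p}$ for all integers $a$ that are relatively prime to $p$.


Proof. We must show that $\bar{a}^p = \bar{a}$ in $\mathbb{Z}_p$. Because this equation is true if $\bar{a} = \bar{0}$, it
suffices to show that $\bar{a}^{p-1} = \bar{1}$ in $\mathbb{Z}_p$ whenever $\bar{a} \neq \bar{0}$. But if $\bar{a} \neq \bar{0}$, then $\bar{a}$ has an
inverse in $\mathbb{Z}_p$ by Theorem 7, say $\bar{b}\,\bar{a} = \bar{1}$. Now multiply all the nonzero elements in
$\mathbb{Z}_p$ by $\bar{a}$ to obtain

$\bar{a}\,\bar{1},\ \bar{a}\,\bar{2},\ \ldots,\ \bar{a}\,\overline{(p-1)}$.

These are all distinct (because $\bar{a}\,\bar{r} = \bar{a}\,\bar{s}$ yields $\bar{r} = \bar{s}$ after multiplication by $\bar{b}$) and
none equals $\bar{0}$, so they must be the set of all nonzero elements $\bar{1}, \bar{2}, \ldots, \overline{p-1}$ in
some order. In particular, the products are the same, and we obtain


$\bar{a}^{p-1}\,(\bar{1}\,\bar{2} \cdots \overline{p-1}) = (\bar{a}\,\bar{1})(\bar{a}\,\bar{2}) \cdots (\bar{a}\,\overline{p-1}) = \bar{1}\,\bar{2} \cdots \overline{p-1}$.


But the element $\bar{1}\,\bar{2} \cdots \overline{p-1}$ is invertible in $\mathbb{Z}_p$ (Exercise 24). Hence, multiplication
by its inverse gives $\bar{a}^{p-1} = \bar{1}$, which is what we wanted. ■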
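
Both the statement of Theorem 8 and the key step of its proof can be observed numerically. In the sketch below, `fermat_holds` and `multiples` are hypothetical helper names; `multiples(a, p)` lists the products of $a$ with the nonzero classes, which the proof shows are the nonzero classes again in some order.

```python
def fermat_holds(n: int) -> bool:
    """True when a**n ≡ a (mod n) for every a in 0..n-1."""
    return all(pow(a, n, n) == a % n for a in range(n))


def multiples(a: int, p: int) -> list:
    """Sorted representatives of a*1, a*2, ..., a*(p-1) in Z_p."""
    return sorted((a * k) % p for k in range(1, p))


print(fermat_holds(5), fermat_holds(7), fermat_holds(13))  # True True True
print(fermat_holds(4))                                     # False, since 2**4 = 16 ≡ 0 (mod 4)
print(multiples(3, 7))                                     # [1, 2, 3, 4, 5, 6]
```

A `True` result for some `n` should not be read as a primality test: there are composite moduli for which the congruence also holds for every `a`.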


Note that Fermat's theorem fails if $p$ is not prime; for example, $2^4 \not\equiv 2 \pmod 4$.
Fermat's theorem is important in number theory, and the following result will
be referred to several times. To state it, we use the following useful observation
(Exercise 36): If $p > 2$ is a prime, then $p \equiv 1 \pmod 4$ or $p \equiv 3 \pmod 4$.
Corollary. Let $p > 2$ be a prime.
(1) If $p \equiv 1 \pmod 4$, then $x^2 = -1$ in $\mathbb{Z}_p$, where $x = 1 \cdot 2 \cdots \frac{1}{2}(p - 1)$.
(2) If $p \equiv 3 \pmod 4$, then the equation $x^2 = -1$ has no solution in $\mathbb{Z}_p$.
Proof. Write $\bar{a} = a$ in $\mathbb{Z}_p$ for convenience.
(1) We have $(p - 1)! = -1$ by the Corollary to Theorem 7. Write
$q = \frac{1}{2}(p + 1) \cdots (p - 2)(p - 1)$.
Then,
$xq = [1 \cdot 2 \cdots \frac{1}{2}(p - 1)]\,[\frac{1}{2}(p + 1) \cdots (p - 2)(p - 1)] = (p - 1)! = -1$.
Thus, it suffices to show that $q = x$. Now observe that we can write $q$ as
follows:


$q = (-\frac{1}{2}(p - 1)) \cdots (-2)(-1)$.
Since $p \equiv 1 \pmod 4$, the integer $\frac{1}{2}(p - 1)$ is even. Hence, $q$ has an even
number of factors, and it follows that $q = x$ after all. This proves (1).
(2) Let $p = 4n + 3$ in $\mathbb{Z}$. Suppose $a \in \mathbb{Z}_p$ satisfies $a^2 = -1$ in $\mathbb{Z}_p$; we look for a
contradiction. Since $a^{p-1} = 1$ by Fermat's theorem, we have


$1 = a^{p-1} = a^{4n+2} = (a^2)^{2n+1} = (-1)^{2n+1} = -1$ in $\mathbb{Z}_p$,


a contradiction because $p > 2$. So $x^2 = -1$ has no solution in $\mathbb{Z}_p$, proving
(2). ■
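
Both parts of the Corollary can be checked for small primes. The sketch below computes $x = 1 \cdot 2 \cdots \frac{1}{2}(p-1)$ as a factorial; the helper names `half_factorial` and `has_sqrt_of_minus_one` are assumptions introduced for this illustration.

```python
from math import factorial


def half_factorial(p: int) -> int:
    """Representative in 0..p-1 of x = 1*2*...*((p-1)/2) for an odd prime p."""
    return factorial((p - 1) // 2) % p


def has_sqrt_of_minus_one(p: int) -> bool:
    """True when some x in Z_p satisfies x*x = -1, found by exhaustive search."""
    return any((x * x + 1) % p == 0 for x in range(p))


for p in (5, 13, 17, 29):      # primes with p ≡ 1 (mod 4)
    x = half_factorial(p)
    print(p, x, (x * x + 1) % p == 0)

for p in (3, 7, 11, 19):       # primes with p ≡ 3 (mod 4)
    print(p, has_sqrt_of_minus_one(p))
```

For instance, with $p = 13$ the product is $6! = 720$, which reduces to 5, and $5^2 + 1 = 26$ is a multiple of 13.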


Clearly, a residue class $\bar{a}$ is not the same thing as the integer $a$. However, because
of the definitions $\bar{a} + \bar{b} = \overline{a + b}$ and $\bar{a}\,\bar{b} = \overline{ab}$ in $\mathbb{Z}_n$, the arithmetic of $\mathbb{Z}_n$ closely
resembles that of $\mathbb{Z}$, so much so that in subsequent chapters we adopt the following
convention (used above in the Corollaries to Theorems 7 and 8):


Notational Convention. When working in $\mathbb{Z}_n$ we frequently write the residue
class $\bar{a}$ simply as $a$.


Then $\mathbb{Z}_5 = \{0, 1, 2, 3, 4\}$, and equations such as $3 \cdot 4 = 2$ and $2 + 3 = 0$ appear. This
notation is harmless, once everyone knows that we are using it, and it facilitates
hand calculations (the reader has probably been using it already!). Of course, when
the convention causes confusion, we revert to the more formal $\bar{a}$ notation.


Pierre de Fermat (1601-1665) Fermat was a lawyer by profession and served in the
parliament in Toulouse, France. His mathematical work was a pastime, and he has been
called "the prince of amateurs." This appellation should not be taken as diminishing his
stature, because he did first-rate work in several areas. He invented analytic geometry
prior to Descartes and made contributions to the development of calculus. Along with
Pascal, he is credited with starting the theory of probability.


However, he is most remembered for his work in number theory. Theorem 8 first appeared
in a letter in 1640, and a proof was first published much later by Euler. Fermat published
virtually nothing, and his results became known through letters to his friends (many to
Mersenne) and as notes jotted in the margin of his copy of Arithmetica by Diophantus,
usually with no proof. The most famous of these notes is the assertion that, if $n \geq 3$,
positive integers $x$, $y$, and $z$ do not exist such that $x^n + y^n = z^n$. This assertion has
become known as "Fermat's Last Theorem", and he wrote that "I have found a truly
remarkable proof but the margin was too small to contain it." His intuition was so good
that every other theorem that he claimed he could prove has been subsequently verified.
However, despite the best efforts of the greatest mathematicians, the "Last Theorem"
remained open for more than 300 years. But in 1994, in a spectacular display of mathematical
virtuosity, Andrew Wiles of Princeton University finally proved the result. Wiles related
Fermat's conjecture to a problem in geometry, which he solved.


Exercises 1.3


1. In each case determine whether the statement is true or false.

(a) $40 \equiv 13 \pmod 9$ (b) $-29 \equiv 1 \pmod 7$
(c) $-29 \equiv 6 \pmod 7$ (d) $132 \equiv 0 \pmod{11}$
(e) $8 \equiv 8 \pmod{n}$ (f) $34 \equiv 1 \pmod 5$
(g) $84 \equiv 2 \pmod{13}$

2. In each case find all integers $k$ making the statement true.

(a) $4 \equiv 2k \pmod 7$ (b) $12 \equiv 3k \pmod{10}$
(c) $3k \equiv k \pmod 9$ (d) $5k \equiv k \pmod{15}$

3. Find all integers $k \geq 2$ such that

(a) $-3 \equiv 7 \pmod{k}$ (b) $7 \equiv -5 \pmod{k}$
(c) $3 \equiv k^2 \pmod{k}$ (d) $5 \equiv k \pmod{k^2}$

4. Find all integers $k \geq 2$ such that $k^2 \equiv 5k \pmod{15}$.

5. (a) Show that congruence modulo 0 is equality.

(b) What can you say about congruence modulo 1?

6. (a) Prove Theorem 1.

(b) Prove (1)-(4) of Theorem 4.

7. If $a \equiv b \pmod{n}$ and $m \mid n$, show that $a \equiv b \pmod{m}$.

8. Find the remainder when
(a) 10515 is divided by 7 (b) 8°91 is divided by 5

9. Find the unit decimal digit of

(a) 31027 (b) 977118

10. Show that the unit decimal digit of $k^4$ must be 0, 1, 5, or 6 for all integers $k$.

11. If $p \neq 2, 3$ is prime, show that $\bar{p} = \bar{1}$ or $\bar{p} = \bar{5}$ in $\mathbb{Z}_6$.

12. (a) If $a$ is an integer, show that $a^2 \equiv 0$ or $a^2 \equiv 1 \pmod 4$.

(b) Show that none of 11, 111, 1111, 11111, ..., is a perfect square.

13. Show that $a^5$ is congruent to 0, 1, or $-1$ mod 11 for every integer $a$.

14. Show that $\bar{a}^7 = \bar{a}$ in $\mathbb{Z}_7$ for every integer $a$ using the method of Example 5.

15. Show that $\bar{a}(\bar{a} + \bar{1})(\bar{a} + \bar{2}) = \bar{0}$ in $\mathbb{Z}_6$ for every integer $a$.

16. Show that a® + 2 is not divisible by 7 for any integer $a$.

17. Show that a = @ in $\mathbb{Z}_6$ for every integer $a$.

18. (a) Show that every integer $a$ has a cube root in $\mathbb{Z}_5$ ($\bar{a} = \bar{b}^3$ for some integer $b$).

(b) If $n \geq 3$, show that some integer has no square root in $\mathbb{Z}_n$.

19. (a) Show that no integer of the form $k^2 + 1$ is a multiple of 7.

(b) Find all integers $k$ such that $k^2 + 1$ is a multiple of 17.

20. If a space mission takes exactly 175 hours and the craft blasts off at 8 a.m., at what
hour of the day will it land?

21. Let $n = d_k d_{k-1} \cdots d_2 d_1 d_0$ be the decimal representation of $n$.

(a) Show that $3 \mid n$ if and only if 3 divides $(d_0 + d_1 + \cdots + d_k)$.

(b) Show that $11 \mid n$ if and only if 11 divides $(d_0 - d_1 + d_2 - d_3 + \cdots \pm d_k)$.

(c) Show that $6 \mid n$ if and only if 6 divides $[d_0 + 4(d_1 + d_2 + \cdots + d_k)]$.

22. (a) In $\mathbb{Z}_{35}$, find the inverse of $\overline{13}$ and use it to solve $\overline{13}x = \bar{9}$.

(b) In Zs, find the inverse of 7 and use it to solve r= 1.

(c) In Zoo, find the inverse of II and use it to solve Ila = 6.

(d) In Zyg, find the inverse of 9 and use it to solve 9x2 = 14.

23. (a) If $\bar{a}\,\bar{b} = \bar{a}\,\bar{c}$ in $\mathbb{Z}_n$, and if $\bar{a}$ has an inverse in $\mathbb{Z}_n$, show that $\bar{b} = \bar{c}$.

(b) If $\bar{a}$ has an inverse in $\mathbb{Z}_n$, show that the inverse is unique.

24. (a) If $\bar{a}$ and $\bar{b}$ both have inverses in $\mathbb{Z}_n$, show that the same is true for $\bar{a}\,\bar{b}$.

(b) If $\bar{a}_1, \bar{a}_2, \ldots, \bar{a}_m$ all have inverses in $\mathbb{Z}_n$, show that the same is true of their
product $\bar{a}_1 \bar{a}_2 \cdots \bar{a}_m$.

25. Find all solutions in $\mathbb{Z}_n$ (as indicated) for each of the given equations.


Ba+2y=1. 8a2+4y=1. 
) {St yer om {et a 
3a +2y=1 32+ 4y =1 
oft pe eet @ {et Fe ee: 
Ba+2y=1. 38a+4y=1, 
@ {it yea es {it ee ee 
26. If $p$ is a prime and $x^2 = \bar{a}^2$ in $\mathbb{Z}_p$, show that $x = \bar{a}$ or $x = -\bar{a}$.

27. (a) Find all $x$ in $\mathbb{Z}_7$ such that $x^2 + \bar{5}x + \bar{4} = \bar{0}$.

(b) Find all x in Zs such that 2? +2+3=0.

(c) Find all x in Zs such that 2? +2+2=0.

(d) Find all # in Zg such that 2? +2+7=0.

(e) Let $n$ be odd. Show that $\bar{2}$ has an inverse $\bar{r}$ in $\mathbb{Z}_n$. Show that $x^2 + \bar{a}x + \bar{b} = \bar{0}$
has a solution in $\mathbb{Z}_n$ if and only if $(\bar{r}^2 \bar{a}^2 - \bar{b})$ is a square in $\mathbb{Z}_n$.

28. Find $x \in \mathbb{Z}$ such that $x \equiv 8 \pmod{10}$, $x \equiv 3 \pmod 9$, and $x \equiv 2 \pmod 7$.

29. (a) If $\bar{a}\,\bar{b} = \bar{0}$ in $\mathbb{Z}_n$ and $\gcd(a, n) = 1$, show that $\bar{b} = \bar{0}$.

(b) Show that $\bar{a}$ is invertible in $\mathbb{Z}_n$ if and only if $\bar{a}\,\bar{b} = \bar{0}$ implies that $\bar{b} = \bar{0}$.

30. Show that the following conditions on an integer $n \geq 2$ are equivalent.

(1) $\bar{a}^2 = \bar{0}$ in $\mathbb{Z}_n$ implies that $\bar{a} = \bar{0}$.

(2) $n$ is square free (that is, a product of distinct primes).

[Hint: Theorem 5 §1.2.]

31. Show that the following conditions on an integer $n \geq 2$ are equivalent.

(1) If $\bar{a}$ is in $\mathbb{Z}_n$, then either $\bar{a}$ is invertible or $\bar{a}^k = \bar{0}$ for some $k \geq 1$.

(2) $n$ is a power of a prime.

32. If $p \geq 3$ is a prime, show that every element of $\mathbb{Z}_p$ has a $(p - 2)$th root. [Hint:
Use Fermat's theorem to show that $f: \mathbb{Z}_p \to \mathbb{Z}_p$ is one-to-one, where $f(\bar{a}) = \bar{a}^{p-2}$.
Apply Theorem 2 §0.3.]

33. Show that $2^{37} - 1$ is divisible by 223 and that $2^{32} + 1$ is divisible by 641. (Remarkably,
$\frac{1}{223}(2^{37} - 1)$ is also prime.) Note: If $p$ is a prime, numbers of the form $2^p - 1$ and
$2^{2^n} + 1$ are called Mersenne numbers and Fermat numbers, respectively, and
were once thought to be all primes.

34. Let $a$ and $n$ denote integers with $n \geq 2$, and write $d = \gcd(a, n)$.

(a) Show that $ax \equiv b \pmod{n}$ has a solution if and only if $d \mid b$.

(b) If $d = ra + sn$, $r$ and $s$ integers, show that $x_0 = r(b/d)$ is one solution.

(c) If $x_0$ is any solution, show that there are exactly $d$ solutions that are distinct
modulo $n$: $\{x_0,\ x_0 + \frac{n}{d},\ x_0 + 2\frac{n}{d},\ \ldots,\ x_0 + (d - 1)\frac{n}{d}\}$. [Hint: If $ax \equiv b \pmod{n}$, show
that $a(x - x_0) \equiv 0 \pmod{n}$, so $(a/d)(x - x_0) \equiv 0 \pmod{n/d}$ by Exercise 11 §1.2.
Conclude that $x - x_0 \equiv 0 \pmod{n/d}$.]

(d) Find all solutions to $15x \equiv 25 \pmod{35}$.

(e) Find all solutions to $21x \equiv 14 \pmod{38}$.

(f) Find all solutions to $21x \equiv 8 \pmod{33}$.

35. Let $p$ be a prime. If $x^2 = 1$ in $\mathbb{Z}_p$, show that $x = 1$ or $x = -1$.

36. Let $p > 2$ be a prime. Show that either $p \equiv 1 \pmod 4$ or $p \equiv 3 \pmod 4$.

37. (a) Show that if $a^n \equiv a \pmod{n}$ holds for all integers $a$, the modulus $n$ must be square
free, that is, a product of distinct primes.

(b) Show that $a^{561} \equiv a \pmod{561}$ for all integers $a$. [Hint: Use Theorem 5 §1.2 to
reduce the problem to showing that $a^{561} \equiv a \pmod{p}$, where $p = 3$, 11, or 17. In
each case, use Fermat's theorem in the form $a^{p-1} \equiv 1 \pmod{p}$ whenever $p$ does not
divide $a$.]


1.4 PERMUTATIONS 


A permutation of the numbers 1,2, and 3 is a rearrangement of these numbers in 
a definite order. Thus, the six possibilities are 


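$$1\ 2\ 3 \qquad 1\ 3\ 2 \qquad 2\ 1\ 3 \qquad 2\ 3\ 1 \qquad 3\ 1\ 2 \qquad 3\ 2\ 1$$


They can also be described as mappings $\{1, 2, 3\} \to \{1, 2, 3\}$:


$$\begin{array}{cccccc}
1 \to 1 & 1 \to 1 & 1 \to 2 & 1 \to 2 & 1 \to 3 & 1 \to 3 \\
2 \to 2 & 2 \to 3 & 2 \to 1 & 2 \to 3 & 2 \to 1 & 2 \to 2 \\
3 \to 3 & 3 \to 2 & 3 \to 3 & 3 \to 1 & 3 \to 2 & 3 \to 1
\end{array}$$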




We use this terminology of mappings to describe permutations. 

If $X$ and $Y$ are sets, recall that a mapping $\alpha: X \to Y$ is a rule that assigns to every element $x$ of $X$ exactly one element $\alpha(x)$ of $Y$, called the image of $x$ under $\alpha$. Hence, the diagram

$$\begin{array}{c} 1 \to 1 \\ 2 \to 3 \\ 3 \to 2 \end{array}$$

describes the mapping $\alpha: \{1, 2, 3\} \to \{1, 2, 3\}$ given by the rule $\alpha(1) = 1$, $\alpha(2) = 3$, $\alpha(3) = 2$.

Now consider a mapping $\alpha: \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$. Because such mappings occur frequently, we write $\alpha(k) = \alpha k$ for simplicity. Our interest is in when the images $\alpha 1, \alpha 2, \ldots, \alpha n$ are a permutation of the numbers $1, 2, \ldots, n$; that is, each element of $\{1, 2, \ldots, n\}$ occurs exactly once in the list $\alpha 1, \alpha 2, \ldots, \alpha n$. In other words, the function $\alpha$ is both one-to-one and onto (a bijection).¹³


Given an integer $n \geq 1$, write $X_n = \{1, 2, \ldots, n\}$.
A permutation of $X_n$ is a bijection $\sigma: X_n \to X_n$.


We call the set $S_n$ of all permutations of $X_n$ the symmetric group of degree $n$. Two permutations $\sigma$ and $\tau$ in $S_n$ are equal if they are equal as functions, that is, if $\sigma k = \tau k$ for all $k$ in $X_n$.

To simplify the manipulation of these permutations, a matrix-type notation is useful. For example, if the permutation $\sigma: X_4 \to X_4$ is defined by $\sigma 1 = 3$, $\sigma 2 = 1$, $\sigma 3 = 4$, and $\sigma 4 = 2$, we write it as

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix}.$$

Here the image of each element of $X_4 = \{1, 2, 3, 4\}$ is written below that element. In general, a permutation $\sigma \in S_n$ is written in matrix form as

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma 1 & \sigma 2 & \cdots & \sigma n \end{pmatrix}.$$

Hence, a typical member of $S_n$ takes this form, where $\sigma 1, \sigma 2, \ldots, \sigma n$ is the list of numbers $1, 2, \ldots, n$ in a (possibly) different order.


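Example 1. List the elements of $S_3$ in matrix notation.


Solution. There are six different permutations:

$$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix},\ 
\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix},\ 
\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix},\ 
\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix},\ 
\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix},\ 
\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}.$$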


¹³ A review of one-to-one and onto mappings can be found in Section 0.3.

In general, to construct a permutation

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma 1 & \sigma 2 & \cdots & \sigma n \end{pmatrix},$$

we must choose the numbers $\sigma 1, \sigma 2, \ldots, \sigma n$ from $X_n$ so that they are all distinct. Hence, we have $n$ choices for $\sigma 1$, then $n - 1$ choices for $\sigma 2$, then $n - 2$ choices for $\sigma 3$, and so on. Thus, $\sigma$ can be chosen in $n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1 = n!$ ways, which proves the following theorem:

Theorem 1. The set $S_n$ of permutations of $X_n$ has $|S_n| = n!$ elements.
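The count in Theorem 1 can be checked by brute force for small $n$. The following Python sketch is an illustration only: the function name `symmetric_group` and the choice to store a permutation as the tuple of its images (the bottom row of the matrix notation) are assumptions made here, not notation from the text.

```python
from itertools import permutations
from math import factorial


def symmetric_group(n):
    """Return every permutation of X_n = {1, ..., n} as a tuple.

    The tuple (s1, ..., sn) is the bottom row of the two-row matrix
    notation: the permutation sends k to the k-th entry of the tuple.
    """
    return list(permutations(range(1, n + 1)))


# Compare the number of permutations found with n! for n = 1, ..., 5.
for n in range(1, 6):
    assert len(symmetric_group(n)) == factorial(n)
```

The check is exhaustive, so it is only practical for small $n$; the counting argument above is what establishes the result in general.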


Let $\sigma$ and $\tau$ be permutations in $S_n$. Both are mappings from $X_n$ to $X_n$, and we write them as follows:

$$X_n \xrightarrow{\ \tau\ } X_n \xrightarrow{\ \sigma\ } X_n.$$

We then define the composite $\sigma\tau: X_n \to X_n$ by first applying $\tau$ and then $\sigma$:

$$(\sigma\tau)k = \sigma(\tau k), \quad \text{for all } k \in X_n.$$

Because both $\sigma$ and $\tau$ are one-to-one and onto, these properties hold for the composite $\sigma\tau$ (see Theorem 3 §0.3). Hence, $\sigma\tau$ is again a permutation in $S_n$.
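Example 2. Compute $\sigma\tau$ if

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix} \quad \text{and} \quad \tau = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix}.$$

Solution. Consider the action of $\sigma\tau$ on 1: $(\sigma\tau)1 = \sigma(\tau 1) = \sigma 2 = 4$. We can compute it directly from the matrix forms:

$$\sigma\tau = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 2 & 1 & 3 \end{pmatrix}.$$

It is important to remember that, in computing $\sigma\tau$, we apply $\tau$ first and then $\sigma$. Thus, we read $1 \xrightarrow{\tau} 2$ from the matrix for $\tau$, then $2 \xrightarrow{\sigma} 4$ from the matrix for $\sigma$. The result is $1 \xrightarrow{\sigma\tau} 4$, as indicated. Similarly, $2 \xrightarrow{\tau} 4 \xrightarrow{\sigma} 2$ leads to $2 \xrightarrow{\sigma\tau} 2$. We can read the entire action of $\sigma\tau$ in this manner. The following diagrams illustrate what is happening:

$$\begin{array}{ccccc}
 & \tau & & \sigma & \\
1 & \to & 2 & \to & 4 \\
2 & \to & 4 & \to & 2 \\
3 & \to & 3 & \to & 1 \\
4 & \to & 1 & \to & 3
\end{array}
\qquad\qquad
\begin{array}{ccc}
 & \sigma\tau & \\
1 & \to & 4 \\
2 & \to & 2 \\
3 & \to & 1 \\
4 & \to & 3
\end{array}$$

The action of $\sigma\tau$ is read from the first diagram by following the arrows. $\square$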
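The right-to-left convention for composition is easy to get wrong, so a small computational check may help. The Python sketch below is illustrative only: the function name `compose` and the tuple representation (bottom row of the matrix notation) are assumptions made here, and $\sigma$ and $\tau$ are taken to have bottom rows 3 4 1 2 and 2 4 3 1, which is consistent with the computations $\tau 1 = 2$, $\sigma 2 = 4$, $\tau 2 = 4$, $\sigma 4 = 2$ in Example 2.

```python
def compose(sigma, tau):
    """Return the composite sigma*tau: apply tau first, then sigma.

    Permutations are tuples of images, so sigma[k - 1] is the image of k.
    """
    return tuple(sigma[tau[k] - 1] for k in range(len(tau)))


sigma = (3, 4, 1, 2)
tau = (2, 4, 3, 1)

print(compose(sigma, tau))  # (4, 2, 1, 3)
print(compose(tau, sigma))  # (3, 1, 2, 4)
```

The two printed rows differ, which illustrates that the order of the factors matters.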


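Note that $\sigma\tau \neq \tau\sigma$ in general: If $\sigma$ and $\tau$ are as in Example 2,

$$\tau\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 2 & 4 \end{pmatrix}$$

is not the same as $\sigma\tau$ (computed in Example 2). If it happens that $\sigma\tau = \tau\sigma$, we say that $\sigma$ and $\tau$ commute. Thus, two permutations need not commute (but see Theorem 3). On the other hand, if $\sigma$, $\tau$, and $\mu$ are three permutations in $S_n$, then we always have

$$(\sigma\tau)\mu = \sigma(\tau\mu),$$

which we can easily verify directly (see Theorem 3 §0.3).

The identity permutation $\varepsilon$ in $S_n$ is defined as

$$\varepsilon = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}.$$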




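In other words, $\varepsilon k = k$ holds for every $k \in X_n$. It is easy to verify that

$$\varepsilon\sigma = \sigma = \sigma\varepsilon$$

holds for all $\sigma \in S_n$, so $\varepsilon$ plays the role in $S_n$ that 1 plays for multiplication of numbers.

Consider the permutation

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 2 & 1 \end{pmatrix}$$

in $S_4$. The action of $\sigma$ is obtained by reading down: $\sigma 1 = 3$, $\sigma 2 = 4$, $\sigma 3 = 2$, and $\sigma 4 = 1$. There is clearly another permutation in $S_4$ obtained by reading up: $3 \to 1$, $4 \to 2$, $2 \to 3$, and $1 \to 4$. This new permutation is determined uniquely by $\sigma$. In fact, it is the inverse of $\sigma$ (denoted $\sigma^{-1}$ as in Section 0.3). Thus,

$$\sigma^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 1 & 2 \end{pmatrix}.$$

In general, if $\sigma \in S_n$, the fact that $\sigma: X_n \to X_n$ is one-to-one and onto implies (Theorem 6 §0.3) that a uniquely determined permutation $\sigma^{-1}: X_n \to X_n$ exists (called the inverse of $\sigma$), which satisfies

$$\sigma(\sigma^{-1}k) = k \quad \text{and} \quad \sigma^{-1}(\sigma k) = k, \quad \text{for all } k \in X_n. \tag{$*$}$$

Equations ($*$) imply that each of $\sigma$ and $\sigma^{-1}$ reverses the action of the other and hence that we can indeed obtain the action of $\sigma^{-1}$ from

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma 1 & \sigma 2 & \cdots & \sigma n \end{pmatrix}$$

by reading up.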


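Example 3. Find the inverse of $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 4 & 1 & 8 & 3 & 2 & 5 & 6 & 7 \end{pmatrix}$ in $S_8$.


Solution. Reversing the action of $\sigma$ gives

$$\sigma^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 2 & 5 & 4 & 1 & 6 & 7 & 8 & 3 \end{pmatrix}. \qquad \square$$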
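The "reading up" rule translates directly into a short procedure. The Python sketch below is illustrative only: the name `inverse` and the tuple representation are assumptions made here, and the permutation used is the one whose inverse has bottom row 2 5 4 1 6 7 8 3, as in the solution to Example 3.

```python
def inverse(sigma):
    """Return the inverse of sigma by 'reading up'.

    sigma is the tuple of images (bottom row of the matrix notation).
    If sigma sends k to j, the inverse sends j back to k.
    """
    inv = [0] * len(sigma)
    for k, j in enumerate(sigma, start=1):
        inv[j - 1] = k
    return tuple(inv)


sigma = (4, 1, 8, 3, 2, 5, 6, 7)
print(inverse(sigma))  # (2, 5, 4, 1, 6, 7, 8, 3)
```

Applying `inverse` twice returns the original tuple, which mirrors the fact that $\sigma$ and $\sigma^{-1}$ each reverse the action of the other.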


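If $\sigma \in S_n$, it is related to $\sigma^{-1}$ by composition. Indeed, because the identity permutation $\varepsilon$ in $S_n$ satisfies $\varepsilon k = k$ for all $k \in X_n$, we can write equations ($*$) as

$$\sigma\sigma^{-1} = \varepsilon \quad \text{and} \quad \sigma^{-1}\sigma = \varepsilon.$$

This and other properties of composition discussed earlier are recorded in the following theorem for reference.

Theorem 2. Let $\sigma$, $\tau$, and $\mu$ denote permutations in $S_n$.

(1) $\sigma\tau$ is in $S_n$.

(2) $\sigma\varepsilon = \sigma = \varepsilon\sigma$.

(3) $\sigma(\tau\mu) = (\sigma\tau)\mu$.

(4) $\sigma\sigma^{-1} = \varepsilon = \sigma^{-1}\sigma$.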


By virtue of this, $S_n$ is said to be a group under composition, which explains the name “symmetric group.” Groups in general are discussed in Chapter 2.


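Example 4. Given

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 5 & 1 & 2 & 3 \end{pmatrix}$$

and

$$\tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 1 & 5 & 4 \end{pmatrix},$$

find $\chi$ in $S_5$ such that $\chi\sigma = \tau$.


Solution. Suppose that $\chi \in S_5$ exists such that $\tau = \chi\sigma$. Multiply on the right by $\sigma^{-1}$ to get $\tau\sigma^{-1} = \chi\sigma\sigma^{-1} = \chi\varepsilon = \chi$. Thus,

$$\chi = \tau\sigma^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 1 & 5 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 4 & 5 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 5 & 4 & 3 & 2 \end{pmatrix}.$$

The reader should verify that $\chi$ actually works, that is, $\chi\sigma = \tau$. $\square$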


Let $\sigma \in S_n$ so that $\sigma: X_n \to X_n$ is a bijection. We say that an element $k \in X_n$ is fixed by $\sigma$ if $\sigma k = k$. If $\sigma k \neq k$, we say that $k$ is moved by $\sigma$, and we write $M_\sigma = \{k \in X_n \mid k \text{ is moved by } \sigma\}$. Two permutations $\sigma$ and $\tau$ are called disjoint if no element of $X_n$ is moved by both; that is, if $M_\sigma \cap M_\tau = \emptyset$.

Clearly, the identity permutation $\varepsilon$ in $S_n$ is the only permutation that fixes every element of $X_n$. By contrast,

$$\begin{pmatrix} 1 & 2 & 3 & \cdots & n-1 & n \\ 2 & 3 & 4 & \cdots & n & 1 \end{pmatrix}$$

moves every element of $X_n$, whereas

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 5 & 4 & 1 \end{pmatrix}$$

moves 1, 3, and 5 and fixes 2 and 4. The following result is needed in the proof of Theorem 3.


Lemma 1.¹⁴ If $k \in M_\sigma$, then $\sigma k \in M_\sigma$.


Proof. Otherwise, $\sigma k$ is fixed by $\sigma$; that is, $\sigma(\sigma k) = \sigma k$. But then the fact that $\sigma$ is one-to-one gives $\sigma k = k$, which is contrary to the hypothesis. $\blacksquare$


¹⁴ The word “lemma” means a subsidiary proposition used in the proof of another proposition.


Theorem 3. If $\sigma$ and $\tau$ in $S_n$ are disjoint, then $\sigma\tau = \tau\sigma$.


Proof. For $k \in X_n$, we must show that $(\tau\sigma)k = (\sigma\tau)k$. Since $M_\sigma \cap M_\tau = \emptyset$ by hypothesis, there are three cases (see the diagram).

• Case 1: $k \in M_\sigma$. Then $\sigma k \in M_\sigma$ too (by Lemma 1), so neither lies in $M_\tau$. Hence, both are fixed by $\tau$, so $\tau k = k$ and $\tau(\sigma k) = \sigma k$. Hence,
$$(\tau\sigma)k = \tau(\sigma k) = \sigma k = \sigma(\tau k) = (\sigma\tau)k.$$

• Case 2: $k \in M_\tau$. This case is analogous to Case 1, and is left to the reader.

• Case 3: $k \notin M_\sigma$ and $k \notin M_\tau$. Then $\sigma k = k$ and $\tau k = k$, so
$$(\tau\sigma)k = \tau(\sigma k) = \tau k = k = \sigma k = \sigma(\tau k) = (\sigma\tau)k.$$

This completes the proof. $\blacksquare$
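Theorem 3 can be confirmed exhaustively in a small symmetric group. The Python sketch below is illustrative only: the helper names `compose` and `moved` and the tuple representation of a permutation are assumptions made here. It checks every pair of permutations in $S_4$ and confirms that each disjoint pair commutes.

```python
from itertools import permutations


def compose(sigma, tau):
    """Return sigma*tau: apply tau first, then sigma (tuples of images)."""
    return tuple(sigma[tau[k] - 1] for k in range(len(tau)))


def moved(sigma):
    """Return the set M_sigma of elements k with sigma(k) != k."""
    return {k for k, image in enumerate(sigma, start=1) if image != k}


s4 = list(permutations(range(1, 5)))
disjoint_pairs = [
    (s, t) for s in s4 for t in s4 if moved(s).isdisjoint(moved(t))
]

# Every disjoint pair must commute.
assert all(compose(s, t) == compose(t, s) for s, t in disjoint_pairs)
```

The check runs in one direction only: it does not claim that commuting permutations are disjoint.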


Note that the converse to Theorem 3 is not true. For example, $\sigma\sigma^{-1} = \sigma^{-1}\sigma$ for any $\sigma$ in $S_n$, but $\sigma$ and $\sigma^{-1}$ are certainly not disjoint (if $\sigma \neq \varepsilon$). Theorem 3 is important because it leads to a proof of the fact (Theorem 5 below) that every permutation in $S_n$ can be written as a product of pairwise disjoint (and commuting) factors. We now turn our attention to this topic.




Cycles 


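Consider the permutation

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 4 & 6 & 3 & 2 & 5 & 1 \end{pmatrix}$$

in $S_6$. The action of $\sigma$ is described graphically as

$$1 \to 4 \to 2 \to 6 \to 1.$$

Thus, the elements $\sigma$ moves are moved in a cycle, and $\sigma$ is called a cycle for this reason. We write $\sigma$ as $\sigma = (1\ 4\ 2\ 6)$. This notation lists only elements moved by $\sigma$, and each is moved to its neighbor to the right, except the last element, which “cycles around” to the first. We generalize this type of permutation as follows.


Let $k_1, k_2, \ldots, k_r$ be distinct elements of $X_n$. Then, as shown in the diagram, the cycle

$$\sigma = (k_1\ k_2\ \cdots\ k_r)$$

is the permutation in $S_n$ defined by

$$\begin{aligned}
\sigma k_i &= k_{i+1}, && \text{if } 1 \leq i \leq r-1, \\
\sigma k_r &= k_1, \\
\sigma k &= k, && \text{if } k \notin \{k_1, k_2, \ldots, k_r\}.
\end{aligned}$$

[Diagram: $k_1 \to k_2 \to \cdots \to k_r \to k_1$, arranged in a circle.]

We say that $\sigma$ has length $r$ and refer to $\sigma$ as an $r$-cycle. Note that the only cycle of length 1 is $\varepsilon$, that is, $(k) = \varepsilon$ for each $k \in X_n$.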


Example 5. Write

$$\tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 4 & 7 & 1 & 6 & 5 & 2 & 3 \end{pmatrix}$$

in cycle notation.

Solution. $\tau = (1\ 4\ 6\ 2\ 7\ 3)$. Note that $\tau$ fixes 5. $\square$


Example 6. $S_3 = \{\varepsilon, (1\ 2\ 3), (1\ 3\ 2), (1\ 2), (1\ 3), (2\ 3)\}$ from Example 1. Hence, $S_3$ consists of cycles; however, the same is not true of $S_n$ in general, as we show later.


Example 7. The only cycle of length 1 is the identity permutation $\varepsilon$.


To reverse the action of a cycle, we simply go around the cycle in the opposite direction. Thus we obtain


Theorem 4. If $\sigma$ is an $r$-cycle, then $\sigma^{-1}$ is also an $r$-cycle. More precisely, if $\sigma = (k_1\ k_2\ \cdots\ k_{r-1}\ k_r)$, then $\sigma^{-1} = (k_r\ k_{r-1}\ \cdots\ k_2\ k_1)$.


Cycle notation is much simpler than two-row matrix notation. However, we must briefly discuss two ambiguous aspects of cycle notation. First, the same permutation can be written in several ways in cycle notation. For example, $\sigma = (1\ 4\ 2\ 3)$ in $S_4$ can be written as $\sigma = (4\ 2\ 3\ 1) = (2\ 3\ 1\ 4) = (3\ 1\ 4\ 2)$. This is harmless once we are aware of it.

The second ambiguity can be illustrated as follows: Given $\sigma = (1\ 2\ 4)$, is it in $S_4$ (fixing 3) or in $S_5$ (fixing 3 and 5)? We introduce the following convention so that it does not matter.


Convention. Every permutation in $S_n$ is regarded as a permutation in $S_{n+1}$ that fixes $n + 1$. Thus,
$$S_1 \subseteq S_2 \subseteq S_3 \subseteq \cdots.$$


We shall adhere to this convention throughout this book.
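Of course, not every permutation is a cycle. For example, consider

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 3 & 1 & 7 & 6 & 10 & 4 & 2 & 5 & 9 & 8 \end{pmatrix}$$

in $S_{10}$. If we represent the action of $\sigma$ geometrically, we obtain

[Diagram: the four cycles $1 \to 3 \to 7 \to 2 \to 1$, $4 \to 6 \to 4$, $5 \to 10 \to 8 \to 5$, and $9 \to 9$.]

The four cycles are $(1\ 3\ 7\ 2)$, $(4\ 6)$, $(5\ 10\ 8)$, and $(9) = \varepsilon$. These are pairwise disjoint, so each commutes with the others by Theorem 3. Even more remarkable is the fact that $\sigma$ is the product of these cycles (where we omit $(9) = \varepsilon$):

$$\sigma = (1\ 3\ 7\ 2)(4\ 6)(5\ 10\ 8).$$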


The reader should check this assertion. In fact, every permutation can be expressed 
as a product of disjoint cycles in this way. Here is another example. 


Example 8. Factor

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 \\ 5 & 12 & 2 & 1 & 9 & 11 & 4 & 3 & 7 & 10 & 13 & 8 & 6 \end{pmatrix}$$

as a product of (pairwise) disjoint cycles.


Solution. Starting with 1, follow the action of $\sigma$: $1 \to 5 \to 9 \to 7 \to 4 \to 1$. Thus, it has cycled, and the first cycle is $(1\ 5\ 9\ 7\ 4)$. Now start with any member of $X_{13}$ not already considered, say $2 \to 12 \to 8 \to 3 \to 2$; so the next cycle is $(2\ 12\ 8\ 3)$. However, 6 has still not been used. It provides the cycle $(6\ 11\ 13)$. The remaining member of $X_{13}$ is 10, which is fixed by $\sigma$, so the corresponding cycle is $(10) = \varepsilon$. Hence,

$$\sigma = (1\ 5\ 9\ 7\ 4)(2\ 12\ 8\ 3)(6\ 11\ 13)$$

is the desired factorization (where we drop the 1-cycles as before). Of course, the action of $\sigma$ can be sketched as shown previously. $\square$


The method of Example 8 will express every permutation as a product of disjoint cycles because each cycle agrees with $\sigma$ on the elements it moves, and these elements are fixed by the other cycles. In addition, the factorization is unique up to the order of the disjoint cycles, and we give a formal inductive proof of the following theorem at the end of this section.


Theorem 5. Cycle Decomposition Theorem. If $\sigma \neq \varepsilon$ is a permutation in $S_n$, then $\sigma$ is a product of (one or more) disjoint cycles of length at least 2. This factorization is unique up to the order of the factors.
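The method of Example 8 (start at an unused element, follow the images until the start recurs, then repeat) can be sketched as a short procedure. The Python code below is illustrative only: the function name `disjoint_cycles` and the tuple representation of a permutation are assumptions made here, and the test permutation is the one with disjoint cycle factorization $(1\ 5\ 9\ 7\ 4)(2\ 12\ 8\ 3)(6\ 11\ 13)$.

```python
def disjoint_cycles(sigma):
    """Factor sigma into disjoint cycles, dropping 1-cycles.

    sigma is the tuple of images, so sigma[k - 1] is the image of k.
    Each cycle is returned as a tuple starting at its smallest element.
    """
    seen = set()
    cycles = []
    for start in range(1, len(sigma) + 1):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        k = sigma[start - 1]
        while k != start:  # follow start -> sigma(start) -> ... until it cycles
            cycle.append(k)
            seen.add(k)
            k = sigma[k - 1]
        if len(cycle) > 1:  # omit fixed points
            cycles.append(tuple(cycle))
    return cycles


sigma = (5, 12, 2, 1, 9, 11, 4, 3, 7, 10, 13, 8, 6)
print(disjoint_cycles(sigma))
# [(1, 5, 9, 7, 4), (2, 12, 8, 3), (6, 11, 13)]
```

Because every element is visited exactly once, the procedure takes a number of steps proportional to $n$.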


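Example 9. List all the elements of $S_4$, each factored into disjoint cycles.


Solution. The $4! = 24$ elements are as follows:

$$\begin{array}{lllll}
\varepsilon & (1\ 2) & (1\ 2\ 3) & (1\ 2)(3\ 4) & (1\ 2\ 3\ 4) \\
 & (1\ 3) & (1\ 2\ 4) & (1\ 3)(2\ 4) & (1\ 2\ 4\ 3) \\
 & (1\ 4) & (1\ 3\ 4) & (1\ 4)(2\ 3) & (1\ 3\ 2\ 4) \\
 & (2\ 3) & (2\ 3\ 4) & & (1\ 3\ 4\ 2) \\
 & (2\ 4) & (1\ 3\ 2) & & (1\ 4\ 2\ 3) \\
 & (3\ 4) & (1\ 4\ 2) & & (1\ 4\ 3\ 2) \\
 & & (1\ 4\ 3) & & \\
 & & (2\ 4\ 3) & & \qquad \square
\end{array}$$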

The permutations in Example 9 are classified according to the following notion: Two permutations in $S_n$ have the same cycle structure if, when they are factored into disjoint cycles, they have the same number of cycles of each length. We refer to this notion again later.


The Alternating Group 


A cycle of length 2 is called a transposition. Thus, each transposition $\delta$ has the form $\delta = (m\ n)$ where $m \neq n$. Hence,

$$\delta^2 = \varepsilon \quad \text{and} \quad \delta^{-1} = \delta, \quad \text{for every transposition } \delta.$$

Note, however, that $\sigma = (1\ 2)(3\ 4)$ also satisfies $\sigma^2 = \varepsilon$ and $\sigma^{-1} = \sigma$, so these properties do not characterize the transpositions.

One reason for studying transpositions is that every permutation is a product of transpositions. For example, the cycle $(1\ 2\ 3\ 4\ 5\ 6)$ factors as follows:

$$(1\ 2\ 3\ 4\ 5\ 6) = (1\ 2)(2\ 3)(3\ 4)(4\ 5)(5\ 6)$$

as is easily verified. This pattern works in general.


Theorem 6. Every cycle of length $r > 1$ is a product of $r - 1$ transpositions:

$$(k_1\ k_2\ \cdots\ k_r) = (k_1\ k_2)(k_2\ k_3) \cdots (k_{r-2}\ k_{r-1})(k_{r-1}\ k_r).$$

Hence, every permutation is a product of transpositions.


Proof. The verification of the cycle factorization is left to the reader. The rest follows because every permutation is a product of cycles by Theorem 5. $\blacksquare$
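The verification left to the reader in Theorem 6 can be carried out mechanically for any particular cycle. The Python sketch below is illustrative only: the names `cycle_to_transpositions` and `apply_product` are assumptions made here. It applies a product of cycles to an element, rightmost factor first, and compares the cycle $(1\ 2\ 3\ 4\ 5\ 6)$ with the product $(1\ 2)(2\ 3)(3\ 4)(4\ 5)(5\ 6)$.

```python
def cycle_to_transpositions(cycle):
    """Factor (k1 k2 ... kr) as (k1 k2)(k2 k3)...(k_{r-1} k_r)."""
    return [(cycle[i], cycle[i + 1]) for i in range(len(cycle) - 1)]


def apply_product(factors, k):
    """Apply a product of cycles to k, rightmost factor first."""
    for cyc in reversed(factors):
        if k in cyc:
            # Move k to its right-hand neighbor, wrapping at the end.
            k = cyc[(cyc.index(k) + 1) % len(cyc)]
    return k


cycle = (1, 2, 3, 4, 5, 6)
factors = cycle_to_transpositions(cycle)
print(factors)  # [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]

# The cycle and its factorization agree on every element of X_6.
assert all(
    apply_product(factors, k) == apply_product([cycle], k)
    for k in range(1, 7)
)
```

A cycle of length $r$ yields a list of $r - 1$ transpositions, in agreement with the statement of the theorem.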


In contrast to the factorization into cycles, factorizations into transpositions are not unique. For example,

$$(2\ 3)(1\ 2)(2\ 5)(1\ 3)(2\ 4) = (1\ 2\ 4\ 5) = (1\ 5)(1\ 4)(1\ 2).$$

Indeed, any factorization into $m$ transpositions gives rise to a factorization into $m + 2$ transpositions simply by inserting $\varepsilon = (1\ 2)(1\ 2)$ somewhere. This gives a glimpse (admittedly not convincing!) into why the next theorem is true. It asserts that if a permutation can be factored in one way as a product of an even (or odd) number of transpositions, then any factorization into transpositions must involve an even (respectively odd) number of factors.

Two integers $m$ and $n$ are said to have the same parity if they are both even or both odd; equivalently, if $m \equiv n \pmod{2}$.


Theorem 7. Parity Theorem. If a permutation $\sigma$ has two factorizations

$$\sigma = \gamma_n \cdots \gamma_2\gamma_1 = \mu_m \cdots \mu_2\mu_1,$$

where each $\gamma_i$ and $\mu_j$ is a transposition, then $m$ and $n$ have the same parity.


The proof of this astonishing fact is given at the end of this section.

A permutation $\sigma$ is called even or odd according as it can be written in some way as the product of an even or odd number of transpositions. The parity theorem ensures that this is unambiguous; that is, no permutation is both even and odd.

The parity of a cycle $\gamma$ is easy to determine: Theorem 6 shows that $\gamma$ is even if its length is odd, and odd if its length is even. When combined with Theorem 5, this result provides a way to easily compute the parity of any permutation.


Example 10. Determine the parity of $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 5 & 4 & 6 & 1 & 7 & 8 & 2 & 9 & 3 \end{pmatrix}$.


Solution. The factorization of $\sigma$ into disjoint cycles is $\sigma = (1\ 5\ 7\ 2\ 4)(3\ 6\ 8\ 9)$. Then $(1\ 5\ 7\ 2\ 4)$ is even and $(3\ 6\ 8\ 9)$ is odd by Theorem 6, so $\sigma$ is odd (because the sum of an even and an odd integer is odd). $\square$
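The recipe of Example 10 (factor into disjoint cycles, then count $r - 1$ transpositions for each cycle of length $r$) can be sketched as follows. The Python code is illustrative only: the function name `parity` and the tuple representation are assumptions made here, and the permutation tested is the one with cycle factorization $(1\ 5\ 7\ 2\ 4)(3\ 6\ 8\ 9)$.

```python
def parity(sigma):
    """Return 'even' or 'odd' for the permutation sigma.

    sigma is the tuple of images. Each disjoint cycle of length r
    contributes r - 1 transpositions (Theorem 6).
    """
    seen = set()
    transpositions = 0
    for start in range(1, len(sigma) + 1):
        length = 0
        k = start
        while k not in seen:  # trace the cycle containing start
            seen.add(k)
            k = sigma[k - 1]
            length += 1
        if length > 0:
            transpositions += length - 1
    return "even" if transpositions % 2 == 0 else "odd"


sigma = (5, 4, 6, 1, 7, 8, 2, 9, 3)
print(parity(sigma))  # odd
```

Here the 5-cycle contributes 4 transpositions and the 4-cycle contributes 3, for a total of 7.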


The set of all even permutations in $S_n$ is denoted $A_n$. It is called the alternating group of degree $n$ and plays an important role in the theory of groups (in Chapter 2). Theorem 8 collects several facts about $A_n$ that will be needed later.


Theorem 8. If $n \geq 2$, the set $A_n$ has the following properties:

(1) $\varepsilon$ is in $A_n$ and, if $\sigma$ and $\tau$ are in $A_n$, then both $\sigma^{-1}$ and $\sigma\tau$ are in $A_n$.

(2) $|A_n| = \tfrac{1}{2} n!$


Proof. (1) $\varepsilon = (1\ 2)(1\ 2)$, so it is even. If $\sigma$ and $\tau$ are even, write $\sigma = \gamma_1\gamma_2 \cdots \gamma_n$ and $\tau = \delta_1\delta_2 \cdots \delta_m$, where $n$ and $m$ are even and $\gamma_i$ and $\delta_j$ are transpositions. Then $\sigma\tau = \gamma_1\gamma_2 \cdots \gamma_n\delta_1\delta_2 \cdots \delta_m$ is a product of $n + m$ transpositions, and so is even. Finally, write $\mu = \gamma_n \cdots \gamma_2\gamma_1$. The fact that $\gamma_i^2 = \varepsilon$ for each $i$ implies that $\sigma\mu = \varepsilon$ (verify). Hence, $\sigma^{-1} = \sigma^{-1}\varepsilon = \sigma^{-1}\sigma\mu = \varepsilon\mu = \mu$. But $\mu$ is even because $n$ is even, so $\sigma^{-1}$ is even.

(2) Let $O_n$ denote the set of odd permutations in $S_n$. Then $S_n = A_n \cup O_n$ and the parity theorem guarantees that $A_n \cap O_n = \emptyset$. Since $|S_n| = n!$, it suffices to show that $|A_n| = |O_n|$. We do so by exhibiting a bijection $f: A_n \to O_n$. Let $\gamma = (1\ 2)$ and define $f$ by $f(\sigma) = \gamma\sigma$ for all $\sigma \in A_n$. (Note that $\gamma\sigma$ is odd if $\sigma$ is even.) The fact that $\gamma^2 = \varepsilon$ implies that $f$ is a bijection. In fact, $\gamma\sigma = \gamma\sigma_1$ gives $\sigma = \gamma^2\sigma = \gamma^2\sigma_1 = \sigma_1$ (so $f$ is one-to-one); if $\tau \in O_n$, then $\sigma = \gamma\tau \in A_n$ and $f(\sigma) = \gamma\sigma = \gamma^2\tau = \tau$ (so $f$ is onto). Thus, $|A_n| = |O_n|$. $\blacksquare$


A set of permutations is called a group if it contains the identity permutation, the product of any two of its members, and the inverse of any member. Hence, $S_n$ is a group, and the first part of Theorem 8 shows that $A_n$ is a group. The general idea of a group is defined and discussed at length in Chapter 2.


Proof of the Cycle Decomposition Theorem 


If $\sigma \neq \varepsilon$ is a permutation in $S_n$, we show it is a product of disjoint cycles by induction on $n \geq 2$. This is clear if $n = 2$. If $n > 2$, assume that the result is true for $S_{n-1}$ and let $\sigma \in S_n$. If $\sigma n = n$, then $\sigma \in S_{n-1}$ and we are done. So assume $\sigma n \neq n$ and write $m = \sigma^{-1}n$. Then $\sigma m = \sigma(\sigma^{-1}n) = \varepsilon n = n$, and $m \neq n$ (because $\sigma n \neq n$). We write $\gamma = (m\ n)$ and consider $\tau = \sigma\gamma$. Because $\gamma^2 = \varepsilon$, we have $\tau\gamma = \sigma\gamma^2 = \sigma\varepsilon = \sigma$. Moreover, $\tau n = \sigma\gamma n = \sigma m = n$, so $\tau \in S_{n-1}$ and $\tau$ is a product of disjoint cycles by induction. There are two cases:

• Case 1: $\tau m = m$. In this case, $\gamma$ and $\tau$ are disjoint (as $\tau n = n$) and we are done because $\sigma = \tau\gamma$.

• Case 2: $\tau m \neq m$. Then $m$ is moved by (exactly one) cycle factor of $\tau$. Hence we can write

$$\tau = \mu(m\ k_1\ k_2\ \cdots\ k_r),$$

where $\mu$ is a product of disjoint cycles fixing $m, k_1, k_2, \ldots, k_r$ (and also fixing $n$ because $\tau n = n$). Finally, it is easy to verify that

$$\sigma = \tau\gamma = \mu(m\ k_1\ k_2\ \cdots\ k_r)(m\ n) = \mu(m\ n\ k_1\ \cdots\ k_r),$$

which gives $\sigma$ as a product of disjoint cycles.

Turning to the uniqueness, suppose that $\sigma = \gamma_a \cdots \gamma_2\gamma_1 = \delta_b \cdots \delta_2\delta_1$ are two factorizations into disjoint cycles. We proceed by induction on $\max(a, b)$. If this is 1, then $\sigma = \gamma_1 = \delta_1$. Otherwise, let $\sigma$ move $m$. Then $m$ occurs in exactly one $\gamma_i$ and exactly one $\delta_j$. By reordering the factors if necessary, assume that $m$ occurs in $\gamma_1$ and in $\delta_1$. Hence, we can write

$$\gamma_1 = (k_1\ k_2\ \cdots\ k_r) \quad \text{and} \quad \delta_1 = (l_1\ l_2\ \cdots\ l_s),$$

where $k_1 = m = l_1$. We may assume that $r \leq s$. Then, because $k_1 = l_1$,

$$\begin{aligned}
k_2 &= \sigma k_1 = \sigma l_1 = l_2 \\
k_3 &= \sigma k_2 = \sigma l_2 = l_3 \\
&\ \ \vdots \\
k_r &= \sigma k_{r-1} = \sigma l_{r-1} = l_r.
\end{aligned}$$

If $r < s$, the next step gives

$$l_1 = k_1 = \sigma k_r = \sigma l_r = l_{r+1},$$

a contradiction. Thus, $r = s$ and $\gamma_1 = \delta_1$. If we write $\lambda = \gamma_1 = \delta_1$, we obtain $\sigma = \gamma_a \cdots \gamma_2\lambda = \delta_b \cdots \delta_2\lambda$. It follows that $\sigma\lambda^{-1} = \gamma_a \cdots \gamma_2 = \delta_b \cdots \delta_2$ is a product of $a - 1$ (and $b - 1$) disjoint cycles. By induction, $a = b$ and (after possible reordering) $\gamma_i = \delta_i$ for $i = 2, 3, \ldots, a$, which completes the induction.


Proof of the Parity Theorem 
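The proof depends on two preliminary results about transpositions.


Lemma 2. Let $\gamma_1 \neq \gamma_2$ be transpositions. If $\gamma_1$ moves $k$, transpositions $\delta_1$ and $\lambda_2$ exist such that

$$\gamma_2\gamma_1 = \lambda_2\delta_1, \quad \text{where } \delta_1 \text{ fixes } k \text{ and } \lambda_2 \text{ moves } k.$$


Proof. Let $\gamma_1 = (k\ a)$. Because $\gamma_1 \neq \gamma_2$, the transposition $\gamma_2$ has one of the forms $(k\ b)$, $(a\ b)$, or $(b\ c)$ where $k$, $a$, $b$, and $c$ denote distinct integers. In these cases,

$$\begin{aligned}
\gamma_2\gamma_1 &= (k\ b)(k\ a) = (k\ a)(a\ b) \\
\gamma_2\gamma_1 &= (a\ b)(k\ a) = (k\ b)(a\ b) \\
\gamma_2\gamma_1 &= (b\ c)(k\ a) = (k\ a)(b\ c).
\end{aligned}$$

Hence the conclusion of Lemma 2 holds in every case. $\blacksquare$


Lemma 3. If the identity permutation $\varepsilon$ can be written as a product of $n \geq 3$ transpositions, then it can be written as a product of $n - 2$ transpositions.


Proof. Let $\varepsilon = \gamma_n \cdots \gamma_3\gamma_2\gamma_1$, where $n \geq 3$ and the $\gamma_i$ are transpositions. Suppose that $\gamma_1$ moves $k$. If $\gamma_1 = \gamma_2$, then $\gamma_2\gamma_1 = \varepsilon$, so $\varepsilon = \gamma_n \cdots \gamma_4\gamma_3$ and we are done. Otherwise, Lemma 2 gives $\gamma_2\gamma_1 = \lambda_2\delta_1$, where $\delta_1$ fixes $k$ and $\lambda_2$ moves $k$. Thus,

$$\varepsilon = \gamma_n \cdots \gamma_4\gamma_3\lambda_2\delta_1.$$

Again, we are done if $\lambda_2 = \gamma_3$, so we let $\gamma_3\lambda_2 = \lambda_3\delta_2$, where $\delta_2$ fixes $k$ and $\lambda_3$ moves $k$. Hence,

$$\varepsilon = \gamma_n \cdots \gamma_5\gamma_4\lambda_3\delta_2\delta_1.$$

Continue in this way. Either we are done at some stage or we finally arrive at a factorization

$$\varepsilon = \lambda_n\delta_{n-1} \cdots \delta_2\delta_1,$$

where each $\delta_i$ fixes $k$ and $\lambda_n$ moves $k$. But this cannot happen because, if it did,

$$k = \varepsilon k = \lambda_n\delta_{n-1} \cdots \delta_2\delta_1 k = \lambda_n k \neq k,$$

a contradiction. This proves Lemma 3. $\blacksquare$


Proof of the parity theorem. Suppose a permutation $\sigma$ has two factorizations into transpositions:

$$\sigma = \gamma_n \cdots \gamma_2\gamma_1 = \mu_m \cdots \mu_2\mu_1.$$

We must show that $n$ and $m$ are both even or both odd. The fact that $\mu_j^{-1} = \mu_j$ for all $j$ gives $\varepsilon = \mu_1\mu_2 \cdots \mu_m\gamma_n \cdots \gamma_2\gamma_1$. Hence, it suffices to show that $\varepsilon$ cannot be written as the product of an odd number of transpositions. But if $\varepsilon$ is a product of $p$ transpositions, where $p \geq 3$ is odd, then repeating Lemma 3 gives factorizations into $p - 2, p - 4, \ldots$ transpositions. Ultimately we get a factorization of $\varepsilon$ as one transposition, which is impossible. $\blacksquare$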


Exercises 1.4 


Se 123 4 5 = 123 4 5 = 12 3 4 6 
SE Ne ge Be BY NS Oa a a ae ae ee gg 
be permutations. Compute: 
(a) to (b) or (c) 74 


(a) po (e) prom (f) utor 
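2. (a) Verify that any two of $\sigma$, $\tau$, and $\mu$ commute:

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}, \quad \tau = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 1 & 3 \end{pmatrix}, \quad \mu = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix}.$$

(b) Do (a) by first verifying that $\sigma = \tau^2$ and $\mu = \tau^3$.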
3. Let 


1. Let 


and 


In each case solve for x in Sy. 
(a) ox=7 (b) xr =o (c) oY = T 
(d) xro =e (e) Txo =e (f) rye t =o 


4. Suppose that 
e 23 4 >) 
T= 
5 3 14 2 






and 


in Ss. If ol = 2, find o and Tr. 


. Show that 


and 


is impossible for o and 7 in S4. 


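6. If $\sigma$ and $\tau$ fix $k$, show that $\sigma\tau$ and $\sigma^{-1}$ both fix $k$.

7. (a) How many permutations in $S_5$ fix 1?

(b) How many fix both 1 and 2?

8. (a) If $\sigma\tau = \varepsilon$ in $S_n$, show that $\sigma = \tau^{-1}$.

(b) If $\sigma^2 = \sigma$ in $S_n$, show that $\sigma = \varepsilon$.

9. In $S_n$, show that $\sigma = \tau$ if and only if $\sigma\tau^{-1} = \varepsilon$.

10. If $\sigma$ and $\tau$ are disjoint in $S_n$ and $\sigma\tau = \varepsilon$, what can you say about $\sigma$ and $\tau$? Support your answer.

11. Write the following in two-row matrix notation.

(a) $(1\ 8\ 7\ 4)(3\ 6\ 7\ 5\ 9)$ (b) $(1\ 3\ 5\ 7)(4\ 1\ 9)$

12. Let $\sigma = (1\ 2\ 3)$ and $\tau = (1\ 2)$ in $S_3$.

(a) Show that $S_3 = \{\varepsilon, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$ and that $\sigma^3 = \varepsilon = \tau^2$ and $\sigma\tau = \tau\sigma^2$.

(b) Use (a) to fill in the multiplication table for $S_3$.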

13. Factor each of the following permutations into disjoint cycles, find its parity, and factor the inverse into disjoint cycles.


Olea es aes 


3 
8 

4 
8 9 
oe) 
8 

5 

) 


ee Twp an 7 


6 
7 
6 
7 
3 
7 
7 


ow FKY Dw Oo 
op OF oa ep 
KF O® NDR DQ Fe 


7 
2 
8 
1 2 
OG; 
1 2 
@ (54 
(e) (1 3)2 5 78 8 5 
(f)(1 2 3 4 5)(6 7)(1 3.5 7) 6 3) 
14. If $\sigma\tau = \sigma\mu$ or $\tau\sigma = \mu\sigma$ in $S_n$, show that $\tau = \mu$. Does $\sigma\tau = \mu\sigma$ imply that $\tau = \mu$? Support your answer.

15. In each of (a) $S_5$, and (b) $S_6$, list one permutation of each possible cycle structure (see Example 9).

16. If $\sigma = (1\ 2\ 3\ \cdots\ n)$, show that $\sigma^n = \varepsilon$ and that $n$ is the smallest positive integer with this property.

17. (a) If $\sigma = (1\ 2\ 3\ 4)(5\ 6\ 7)$, factor $\sigma^{-1}$ into disjoint cycles.

(b) If $\sigma = \gamma_1\gamma_2 \cdots \gamma_m$, where the $\gamma_i$ are disjoint cycles, how is the factorization of $\sigma^{-1}$ into disjoint cycles related to the $\gamma_i$? Support your answer.


18. Find the parity of


1 23 4 5 6 7 8 9 10 11 12 13 14 #15 
S-\5 1 6 1 15 18 2.9 4 10 t4 3 12 7 8 


19. Find the parity of each permutation in Exercise 13.

20. Show that $(1\ 2)$ is not a product of 3-cycles.

21. (a) If $\gamma_1, \gamma_2, \ldots, \gamma_m$ are transpositions, show that $(\gamma_1\gamma_2 \cdots \gamma_m)^{-1} = \gamma_m\gamma_{m-1} \cdots \gamma_2\gamma_1$.

(b) Show that $\sigma$ and $\sigma^{-1}$ have the same parity for all $\sigma$ in $S_n$.

(c) Show that $\sigma$ and $\tau\sigma\tau^{-1}$ have the same parity for all $\sigma$ and $\tau$ in $S_n$.

22. Show that $A_{n+1} \cap S_n = A_n$ for all $n \geq 3$ (regard $S_n \subseteq S_{n+1}$ in the usual way).

23. Let $\sigma \in S_n$, $\sigma \neq \varepsilon$. If $n \geq 3$, show that $\gamma \in S_n$ exists such that $\sigma\gamma \neq \gamma\sigma$. [Hint: If $\sigma k = l$ with $k \neq l$, choose $m \notin \{k, l\}$ and take $\gamma = (k\ m)$.]

24. If $\sigma \in S_n$, show that $\sigma^2 = \varepsilon$ if and only if $\sigma$ is a product of disjoint transpositions.

25. If $n \geq 3$, show that every even permutation in $S_n$ is a product of 3-cycles.

26. Let $\gamma$ be any cycle of length $r$. If $\sigma \in S_n$, show that $\sigma\gamma\sigma^{-1}$ is also a cycle of length $r$. More precisely, if $\gamma = (k_1\ k_2\ \cdots\ k_r)$ show that $\sigma\gamma\sigma^{-1} = (\sigma k_1\ \sigma k_2\ \cdots\ \sigma k_r)$.

27. (a) Show that $(k_1\ k_2\ \cdots\ k_r) = (k_1\ k_r)(k_1\ k_{r-1}) \cdots (k_1\ k_2)$.

(b) Show that each $\sigma \in S_n$ is a product of the transpositions $(1\ 2), (1\ 3), \ldots, (1\ n)$. [Hint: Each transposition is such a product by (a) and Exercise 26.]

(c) Repeat (b) for the transpositions $(1\ 2), (2\ 3), \ldots, (n-1\ \ n)$. [Hint: Use (a) and Exercise 26.]

(d) If $\sigma = (1\ 2\ 3\ \cdots\ n)$, show that each element of $S_n$ is a product of the permutations $(1\ 2)$, $\sigma$, and $\sigma^{-1}$. [Hint: Use (b) and Exercise 26.]

28. Let $\sigma = (1\ 2\ 3\ \cdots\ n)$ be a cycle of length $n \geq 2$.

(a) If $n = 2k$, find the factorization of $\sigma^2$ into disjoint cycles.

(b) If $n = mq$ with $m \geq 3$ and $q \geq 2$, show that $\sigma^m$ is a product of $m$ disjoint cycles, each of length $q$.

(c) If $1 \leq m < n$, show that $\sigma^m k \equiv k + m \pmod{n}$.

(d) If $n = p$ is a prime, show that $\sigma^m$ is a cycle of length $p$ for each $m = 1, 2, \ldots, p - 1$.

29. Define the sign of a permutation $\sigma$ to be

$$\operatorname{sgn}\sigma = \begin{cases} 1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd.} \end{cases}$$

Prove that $\operatorname{sgn}(\sigma\tau) = \operatorname{sgn}\sigma \operatorname{sgn}\tau$ for all $\sigma$ and $\tau$ in $S_n$.

30. Consider a puzzle made up of five numbered squares in a $2 \times 3$ frame. Assume that the squares slide vertically and horizontally so that rearrangements are possible. For example, arrangement (2) can be obtained from (1) (in four moves). Call an arrangement “nice” if the lower right position is vacant. Then, the “nice” arrangements correspond to permutations in $S_5$. For example, arrangement (2) corresponds to $(2\ 5\ 3)$.

[Figure: arrangements (1) and (2) of the five numbered squares in the $2 \times 3$ frame.]

Show that every “nice” arrangement corresponds to an even permutation.¹⁵


¹⁵ In fact, every even permutation arises in this way. (See Newman, J. R., World of Mathematics, New York: Simon & Schuster, 1956, p. 2431.)


1.5 AN APPLICATION TO CRYPTOGRAPHY 


How often have I said to you that when you have eliminated the impossible, whatever
remains, however improbable, must be the truth.


—Sir Arthur Conan Doyle 


The ability to transmit messages in a way that cannot be recognized by adversaries 
has intrigued people for centuries. In this brief section, we outline a method that 
uses Fermat’s theorem to encode information in a way that is very difficult to break. 
The idea is based on the following consequence of that theorem. 


Theorem 1. Let $n = pq$, where $p$ and $q$ are distinct primes, write $m = (p - 1)(q - 1)$, and let $e \geq 2$ be any integer such that $e \equiv 1 \pmod{m}$. Then

$$x^e \equiv x \pmod{n} \quad \text{for all } x \text{ such that } \gcd(x, n) = 1.$$


Proof. Because $e \equiv 1 \pmod{m}$, write $e - 1 = ym$, where $y$ is an integer. Then $x^e = x \cdot (x^m)^y$, so it suffices to show that $x^m \equiv 1 \pmod{n}$ whenever $\gcd(x, n) = 1$. This condition certainly implies that $p$ does not divide $x$. Hence, Fermat's theorem shows that $x^{p-1} \equiv 1 \pmod{p}$ and so $x^m = (x^{p-1})^{q-1} \equiv 1^{q-1} = 1 \pmod{p}$. Similarly, $x^m \equiv 1 \pmod{q}$ and so, as $p$ and $q$ are relatively prime, Theorem 5 §1.2 shows that $x^m \equiv 1 \pmod{pq}$. This is what we wanted. $\blacksquare$
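Theorem 1 can be checked numerically for small primes. The Python sketch below is illustrative only: the primes 11 and 13 and the particular exponent $e = 1 + 3m$ are choices made here for the check, and the three-argument built-in `pow` computes powers modulo $n$.

```python
from math import gcd

p, q = 11, 13
n = p * q              # n = 143
m = (p - 1) * (q - 1)  # m = 120
e = 1 + 3 * m          # one exponent with e ≡ 1 (mod m)

# x^e ≡ x (mod n) for every x in 1..n-1 that is relatively prime to n.
units = [x for x in range(1, n) if gcd(x, n) == 1]
assert all(pow(x, e, n) == x for x in units)
print(len(units))  # 120
```

The list `units` has exactly $m = 120$ entries, the residues modulo 143 that are divisible by neither 11 nor 13.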


The coding process can be described as follows. Two distinct primes $p$ and $q$ are chosen, each very large in practice. Then the words available for transmission (and punctuation symbols) are paired with distinct integers $x \geq 2$. The integers $x$ used may be assumed to be chosen relatively prime to $p$ and $q$ if these primes are large enough and, in practice, to be smaller than each of these primes. The idea is to use $p$ and $q$ to compute an integer $r$ from $x$ and then to transmit $r$ rather than $x$. Clearly, $r$ must be chosen in such a way that $x$ (and hence the corresponding word) can be retrieved from $r$. The passage from $x$ to $r$ (called encoding) is carried out by the sender of a message, the integer $r$ is transmitted, and the computation of $x$ from $r$ (decoding) is done by the receiver.

Here is how the process works. Given the distinct primes $p$ and $q$, the cryptographer denotes

$$n = pq \quad \text{and} \quad m = (p - 1)(q - 1)$$

and then chooses any integer $k \geq 2$ such that $\gcd(k, m) = 1$. The sender is given only the numbers $n$ and $k$. If the sender wants to transmit an integer $x$, he or she encodes it by reducing $x^k$ modulo $n$, say,

$$x^k \equiv r \pmod{n}, \quad \text{where } 0 \leq r < n.$$

Then the sender transmits $r$ to the receiver of the message who must use it to retrieve $x$. If the receiver knows the inverse $k'$ of $k$ in $\mathbb{Z}_m$, then $k'k \equiv 1 \pmod{m}$. Hence, Theorem 1 (with $e = k'k$) gives $x^{k'k} \equiv x \pmod{n}$ and

$$x \equiv x^{k'k} = (x^k)^{k'} \equiv r^{k'}$$

modulo $n$. Knowing both $r$ and $k'$, the receiver can compute $x$ (and hence the corresponding word in the message).


Note that all the sender really has to know are $n$ and $k$. A third party intercepting the message $r$ cannot retrieve $x$ without $k'$, and computing it requires $p$ and $q$. Even if the third party can extract the integers $n$ and $k$ from the sender, factoring $n = pq$ in practice is very time-consuming if the primes $p$ and $q$ are large, even with a computer. Hence, the code is extremely difficult to break. Example 1 illustrates how the process works, although the primes used are small.


Example 1. Let $p = 11$ and $q = 13$ so that $n = 143$ and $m = 120$. Then let $k = 7$, chosen so that $\gcd(k, m) = 1$. Encode the number $x = 9$ and then decode the result.


Solution. The sender reduces $x^k = 9^7$ modulo $n = 143$. Working modulo 143: $9^2 \equiv 81$, $9^3 \equiv 14$, $9^4 \equiv 126$, $9^7 \equiv 48$. Hence, $r = 48$ is transmitted. The receiver then finds $k'$, the inverse of $k = 7$ modulo $m = 120$. In fact, the euclidean algorithm gives $1 = 120 - 17 \cdot 7$, so $k' = -17 \equiv 103 \pmod{120}$ is the required inverse. Hence, $x$ is retrieved (modulo $n$) by $x \equiv r^{k'} = 48^{103} \pmod{143}$. One fairly efficient way to compute this is to note that $103 = 1100111$ in binary, so $103 = 1 + 2 + 2^2 + 2^5 + 2^6$. Then the receiver computes $48^t$, where $t$ is a power of 2, by successive squaring of 48 modulo 143:

$$48^2 \equiv 16, \quad 48^{2^2} \equiv 113, \quad 48^{2^3} \equiv 42, \quad 48^{2^4} \equiv 48, \quad 48^{2^5} \equiv 16, \quad 48^{2^6} \equiv 113.$$

Again working modulo 143 gives

$$x \equiv 48^{103} = 48^{1 + 2 + 2^2 + 2^5 + 2^6} \equiv 48 \cdot 16 \cdot 113 \cdot 16 \cdot 113 \equiv 9,$$

which retrieves the original 9. $\square$
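The computation in Example 1 can be reproduced with a few lines of Python. The sketch below is illustrative only: the function name `power_mod` is an assumption made here, and it implements the successive-squaring idea described in the solution by scanning the binary digits of the exponent. The modular inverse is obtained with the built-in `pow(k, -1, m)`, which requires Python 3.8 or later.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus by successive squaring.

    The binary digits of the exponent are scanned from the lowest;
    'square' holds base**(2**i) mod modulus at step i.
    """
    result = 1
    square = base % modulus
    while exponent > 0:
        if exponent & 1:  # this power of 2 occurs in the exponent
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1
    return result


p, q, k = 11, 13, 7
n = p * q                # 143
m = (p - 1) * (q - 1)    # 120
k_inv = pow(k, -1, m)    # inverse of k modulo m (Python 3.8+)

x = 9
r = power_mod(x, k, n)               # encoding
decoded = power_mod(r, k_inv, n)     # decoding
print(r, k_inv, decoded)  # 48 103 9
```

With primes this small the scheme offers no security; the example only traces the arithmetic.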


This system is called the RSA system after its inventors.¹⁶ Other, more comprehensive coverage of cryptography is available,¹⁷ including overviews of the subject, methods, and bibliographies.

The RSA system works by finding two large primes $p$ and $q$ and computing the number $n = pq$. The code is difficult to break because it is difficult to find $p$ and $q$ given $n$. However, in 2002, Manindra Agrawal and two undergraduate students (Neeraj Kayal and Nitin Saxena) gave a simple algorithm that can decide whether a given integer $n$ is prime or not. Moreover, the time taken is bounded by a polynomial function of the number of digits of $n$. This is an important breakthrough in computer science, and certainly affects algorithms like the RSA system.

Cryptography, in general, refers to the transmission of messages where the pri- 
mary aim is to disguise the message to make its interpretation by an unauthorized 
interceptor very difficult. Coding theory, in contrast, aims at fast and correct trans- 
mission of messages; we briefly discuss this topic in Sections 2.11 and 6.7. 


¹⁶ Rivest, R. L., Shamir, A., and Adleman, L., A method for obtaining digital signatures and public-key cryptosystems, Communications of the ACM, 21 (1978), 120–126.


¹⁷ For example, see the section on Algebraic Cryptography in Lidl, R. and Pilz, G., Applied Abstract Algebra, New York: Springer-Verlag, 1983.


Chapter 2 


Groups 


Wherever groups disclose themselves, or could be introduced, simplicity crystallizes out 
of complete chaos. 


—Eric Temple Bell 


The origin of the modern theory of groups lies in the theory of equations. By 
the beginning of the nineteenth century, mathematicians had developed formulas 
for finding the roots of any cubic or quartic equation (analogous to the quadratic 
formula), and the best mathematicians of the day were trying to find such a formula 
for the quintic. It thus came as a great surprise when, in 1824, Niels Henrik Abel 
proved that no such formula exists. At about the same time, Evariste Galois showed 
that any equation of degree n has an associated group of permutations of the roots of 
the equation (that is, a set of permutations closed under compositions and inverses). 
He proved that the equation is solvable if and only if this group has a certain 
property (now called a solvable group). In particular, the fact that the group $A_n$
of even permutations is not solvable for any n ≥ 5 implies that no formula exists
for solving equations of degree n ≥ 5. This spectacular achievement led to modern
Galois theory, but Galois’ work went unrecognized until after his death at age 20. 

Galois worked with groups of permutations. Then, in 1854, Arthur Cayley 
formulated the abstract group concept. While the study of permutation groups con- 
tinues to occupy mathematicians, the abstract theory has the advantage that it iso- 
lates those properties of groups that do not depend on the underlying permutations 
and so can be applied more broadly. We pursue the abstract theory in this chapter 
(and in Chapters 7-9) with permutation groups as one of the most important 
examples. 


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 


2.1 BINARY OPERATIONS 


Abstract algebra is primarily concerned with the study of operations analogous 
to the addition and multiplication of numbers. We define such operations in this 
section and examine some of their general properties. The addition process for 
numbers assigns to any ordered pair (a,b) of numbers a new number, their sum, 
denoted a + b. Similarly, multiplication assigns the product ab to the pair (a, b).

In general, a binary operation * on a set M is a mapping that assigns to each
ordered pair (a, b) of elements of M an element a * b of M. In this case M is said
to be closed under the binary operation. Binary operations are usually denoted
by other symbols (for example, + for numbers) but, for the moment, we use the
generic notation a * b.

A binary operation * is called commutative if a * b = b * a for all a, b in M, and
* is called associative if a * (b * c) = (a * b) * c for all a, b, c in M. An element e in
M is called a unity (or an identity)¹⁸ for * if a * e = a = e * a for all a in M. The
unity for a binary operation is often denoted by different symbols (for example, 0 
and 1 are the identities for addition and multiplication of numbers, respectively). 


Theorem 1. If a binary operation has a unity, that unity is unique. 
Proof. If e and f are both unities, then f = e * f and e * f = e. So e = f. ∎


A set M is called a monoid if a binary operation is defined on M that is
associative and has a unity.¹⁹ We say that (M, *) is a monoid if the operation * is to
be emphasized. If the operation is commutative, we say that M is a commutative 
monoid. 


Example 1. The sets $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, and $\mathbb{Z}_n$ are all commutative monoids under both
addition and multiplication. The additive unity is 0 in all cases ($\bar{0}$ in $\mathbb{Z}_n$), and the
multiplicative unity is 1 ($\bar{1}$ in $\mathbb{Z}_n$).


Example 2. The set $M_n(\mathbb{R})$ of all n × n real matrices is a monoid under both
matrix addition and matrix multiplication, the unities being 0 and I, respectively.
The monoid $(M_n(\mathbb{R}), +)$ is commutative. However, $(M_n(\mathbb{R}), \cdot)$ is not commutative
if n ≥ 2 (the proof of associativity is given in Appendix B).


Example 3. If U is a set, let M = {X | X ⊆ U} denote the set of all subsets of U.
Then (M, ∪) and (M, ∩) are both commutative monoids, the unities being ∅ and
U, respectively.


Example 4. $S_n$ is a monoid with unity $\varepsilon$, and it is noncommutative if n ≥ 3 (see
Exercise 23 §1.4).


Example 5. If X is a nonempty set, let M = {α | α : X → X is a mapping}. Then
M is a monoid using composition of mappings as the operation and the identity
mapping $1_X$ as the unity (Theorem 3 §0.3). Moreover, M is noncommutative if X
has at least two elements.


¹⁸ The term “identity” is often used here but it has other meanings in algebra. So we use the term
unity.
¹⁹ A set with an associative binary operation, but possibly no unity, is called a semigroup.




Example 6. Let * be the operation defined on $\mathbb{N}$ by $n * m = n^m$. This operation is
neither commutative (2 * 3 = 8 but 3 * 2 = 9) nor associative ((2 * 3) * 2 = 64, but
2 * (3 * 2) = 512), and there is no unity (m = e * m for all m is impossible). Thus
($\mathbb{N}$, *) is not a monoid. Note, however, that m * 1 = m for all m.
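The claims in Example 6 are easy to confirm numerically. The sketch below is an illustration only; the function name `star` is a hypothetical stand-in for the operation of the example, and the search for a unity is restricted to a small finite range, so it illustrates rather than proves the final claim.

```python
def star(n, m):
    """The operation n * m = n**m of Example 6."""
    return n ** m

print(star(2, 3), star(3, 2))                      # 8 9: not commutative
print(star(star(2, 3), 2), star(2, star(3, 2)))    # 64 512: not associative

# m * 1 = m holds for every m tested, so 1 behaves like a unity on the right only.
right_unity_ok = all(star(m, 1) == m for m in range(1, 50))

# No e in the tested range satisfies e * m = m for all m in 1..5.
left_unity_found = any(all(star(e, m) == m for m in range(1, 6))
                       for e in range(0, 20))
print(right_unity_ok, left_unity_found)            # True False
```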


A comment on notation is in order here. Binary operations are denoted by 
many different symbols in mathematics. For example, + and · are universally used
for addition and multiplication of numbers, but these symbols are also standard for
the addition and multiplication of matrices. Similarly, ∩ and ∪ are well-established
notations in set theory. When a binary operation has such a standard symbol,
we use it along with any standard notation for the corresponding unity (as in the
foregoing examples). However, when discussing monoids in general, we have been
using * for the binary operation. But algebraists do not do this. They usually adopt
one of the following two formats.


• Multiplicative Notation. Here a * b is written as ab (or sometimes a · b)
and is called the product of a and b. The multiplicative unity is denoted 1
(or $1_M$ if the monoid M must be emphasized).


• Additive Notation. Here a * b is written as a + b and is called the sum
of a and b. The additive unity is denoted 0 (or $0_M$ if the monoid M must
be emphasized).


Multiplicative notation is the most popular format among algebraists. Hence we 
adopt the following convention. 


Convention. In dealing with monoids in general, we use multiplicative notation, 
and denote the unity by 1. 


Hence ab can mean many different things, depending on the monoid under discus- 
sion, but the meaning is nearly always clear from the context. The small amount of 
confusion is more than balanced by the simplicity and conciseness of the notation. 

For a finite monoid M, defining the operation by means of a table is sometimes 
convenient (as in Example 7 below). Given x and y in M, the product xy is the
entry of the table in the row corresponding to x and the column corresponding to y. 
Hence, for the table in Example 7, ab = b and ca = e. The elements of the monoid 
appear in the same order across the top of the table as down the left side. Such a 
table is called the Cayley table of the monoid, honoring Arthur Cayley who used 
it in 1854. 


Example 7. If M = {e, a, b, c}, consider the binary operation shown in the
table. The first row and column show that e is the unity. That the operation is
commutative is also clear from the table because the entries are symmetric about
the main diagonal (upper left to lower right). However, this operation is not
associative. For example a(bc) = ac = e while (ab)c = bc = c. □

If a, b, c, and d are elements in a monoid M, there are various ways to form
the product abcd, for example [(ab)c]d and a[b(cd)]. Verifying that these forms are
equal is not difficult using associativity. In fact, we have




Theorem 2. General Associativity. Let $a_1, a_2, \dots, a_n$ be elements of a monoid
M. If the product $a_1a_2\cdots a_n$ is formed (in that order), the result is the same no
matter which bracketing is used.


Proof.²⁰ Let the standard product $(a_1, a_2, \dots, a_n)$ be defined inductively by setting
$(a_1) = a_1$ and $(a_1, a_2, \dots, a_n) = a_1(a_2, \dots, a_n)$ for $n \ge 2$. Thus, $(a_1, a_2) = a_1a_2$,
$(a_1, a_2, a_3) = a_1(a_2a_3)$, and so on. We use strong induction on $n \ge 1$ to prove
the following statement: If p is any product of $a_1, a_2, \dots, a_n$ in that order, then
$p = (a_1, a_2, \dots, a_n)$. This is clear if n = 1 or n = 2; if n = 3, the only non-standard
product is $(a_1a_2)a_3$, which equals $(a_1, a_2, a_3) = a_1(a_2a_3)$ by associativity.
In general, because p is formed using multiplication, it must factor as p = qr, where
q is a product of $a_1, a_2, \dots, a_k$ and r is a product of $a_{k+1}, \dots, a_n$ for some k with
$1 \le k \le n - 1$. Hence $r = (a_{k+1}, \dots, a_n)$ by induction. If k = 1, then

$$p = a_1(a_2, \dots, a_n) = (a_1, a_2, \dots, a_n)$$

as required. If k > 1, then $q = (a_1, \dots, a_k) = a_1(a_2, \dots, a_k)$ by induction, and

$$\begin{aligned}
p &= (a_1(a_2, \dots, a_k))(a_{k+1}, \dots, a_n) \\
  &= a_1((a_2, \dots, a_k)(a_{k+1}, \dots, a_n)) && \text{(by associativity)} \\
  &= a_1(a_2, \dots, a_n) && \text{(by induction)} \\
  &= (a_1, a_2, \dots, a_n)
\end{aligned}$$

which completes the proof. ∎
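General associativity can be illustrated by brute force. The following Python sketch is an illustration under stated assumptions, not the proof's method: the hypothetical helper `all_bracketings` evaluates every bracketing of an ordered product by splitting it as p = qr in all possible ways (the same factorization used in the proof), and the monoid chosen for the check is the set of mappings of {0, 1, 2} under composition. An exponentiation operation, which is not associative, is included for contrast.

```python
def all_bracketings(elements, op):
    """Return the value of every bracketing of the product of `elements`, in order."""
    if len(elements) == 1:
        return [elements[0]]
    values = []
    for k in range(1, len(elements)):             # split the product as p = q r
        for q in all_bracketings(elements[:k], op):
            for r in all_bracketings(elements[k:], op):
                values.append(op(q, r))
    return values

def compose(f, g):
    """Composition of mappings of {0, 1, 2} stored as tuples: (f g)(x) = f(g(x))."""
    return tuple(f[g[x]] for x in range(len(g)))

maps = [(1, 2, 0), (0, 0, 2), (2, 1, 1), (1, 0, 2), (2, 2, 0)]
assoc_values = all_bracketings(maps, compose)
nonassoc_values = all_bracketings([2, 3, 2, 1], lambda n, m: n ** m)

print(len(assoc_values), len(set(assoc_values)))  # 14 1: all 14 bracketings agree
print(sorted(set(nonassoc_values)))               # [64, 512]: bracketing matters
```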


Theorem 2 enormously simplifies notation. It means that, in a monoid, we may
(and do) write $a_1a_2\cdots a_n$ for the product of n elements with no ambiguity. If the
operation were not associative, we would have to be careful about which bracketing
we use. Of course, the order of the factors in a product does make a difference if
the operation is not commutative.

Let a be an element of a monoid M. If $n \ge 0$ is an integer, inductively define
the nth power $a^n$ of a as follows:

$$a^0 = 1; \qquad a^n = a \cdot a^{n-1} \quad \text{for all } n \ge 1.$$

Thus, $a^1 = a$, $a^2 = a \cdot a$, $a^3 = a \cdot a \cdot a$, and so on. The following laws, familiar for
numbers, hold for any monoid.
Theorem 3. Exponent Laws. Let a and b be elements of a monoid M.

(1) $a^na^m = a^{n+m}$ for all $n \ge 0$ and $m \ge 0$.

(2) $(a^n)^m = a^{nm}$ for all $n \ge 0$ and $m \ge 0$.

(3) If ab = ba, then $(ab)^n = a^nb^n$ for all $n \ge 0$.


Proof. (1) Fix $m \ge 0$ and prove (1) by induction on $n \ge 0$. If n = 0 then
$a^0a^m = 1a^m = a^m = a^{0+m}$. If $n \ge 1$ then $a^na^m = (a\,a^{n-1})a^m = a(a^{n-1}a^m)$. Since
$a^{n-1}a^m = a^{n-1+m}$ by induction, this gives $a^na^m = a(a^{n-1+m}) = a^{n+m}$.


²⁰ This proof will not be used below and so may be omitted at a first reading. By contrast, the
theorem will be used hundreds of times.




(2) Fix $n \ge 0$ and induct on m using (1). If m = 0, then $(a^n)^0 = 1 = a^{n \cdot 0}$. If
$m \ge 1$, then $(a^n)^m = a^n(a^n)^{m-1} = a^na^{n(m-1)} = a^{n+n(m-1)} = a^{nm}$.

(3) This assertion follows by induction on n after first showing $ba^n = a^nb$ for
all $n \ge 0$ (Exercise 10). ∎


It is interesting to note that, in the monoids of Example 3 (with ∩ and ∪ as the
operations), $a^2 = a$ for all a. Hence $a^n = a$ for all $n \ge 1$.


Inverses 


If s is a nonzero real number, the inverse 1/s is the solution to the equation xs = 1.
In this form the idea extends to any monoid. If a is an element in a monoid M, an
element b of M is called an inverse of a if ab = 1 = ba. An element with an inverse
is called a unit. Note that the definition is symmetric in a and b, so that a is an
inverse of b if and only if b is an inverse of a.


Theorem 4. If M is a monoid and a ∈ M has an inverse in M, then that inverse
is unique.


Proof. If both b and b′ are inverses of a, then ab = 1 = ba and ab′ = 1 = b′a. Hence
b′ = b′1 = b′(ab) = (b′a)b = 1b = b. ∎


Note the use of associativity in Theorem 4. In fact, its use is essential: In Example
7, both a and c are inverses of c.

If a is a unit in a multiplicative monoid, the inverse of a is denoted $a^{-1}$. If the
monoid is additive the inverse of a is denoted −a and is called the negative of a.


Example 8. Consider the additive monoids $(\mathbb{Z}, +)$, $(\mathbb{R}, +)$, $(\mathbb{C}, +)$, $(\mathbb{Z}_n, +)$, and
$(M_n(\mathbb{R}), +)$. Then every element is a unit and, in all cases, the usual negative −x
of an element x is the (additive) inverse.


Example 9. In the multiplicative monoids $(\mathbb{R}, \cdot)$ and $(\mathbb{C}, \cdot)$, every nonzero element
is a unit. However, 0 has no inverse in either case.


Example 10. The units of $(\mathbb{Z}, \cdot)$ are 1 and −1.


Example 11. The units in $(\mathbb{Z}_n, \cdot)$ are the residues $\bar{a}$, where a and n are relatively
prime (Theorem 5 §1.3).
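The description of the units of the multiplicative monoid of residues modulo n can be tested directly from the definition of an inverse. The sketch below is illustrative only: the function names `units_by_search` and `units_by_gcd` are hypothetical, residues are represented by the integers 0, ..., n − 1, and since multiplication of residues is commutative it suffices to test ab = 1.

```python
from math import gcd

def units_by_search(n):
    """Residues a modulo n for which some residue b satisfies ab = 1 (mod n)."""
    return [a for a in range(n) if any((a * b) % n == 1 % n for b in range(n))]

def units_by_gcd(n):
    """Residues a modulo n with gcd(a, n) = 1."""
    return [a for a in range(n) if gcd(a, n) == 1]

for n in (8, 9, 12):
    print(n, units_by_search(n), units_by_search(n) == units_by_gcd(n))
# 8 [1, 3, 5, 7] True
# 9 [1, 2, 4, 5, 7, 8] True
# 12 [1, 5, 7, 11] True
```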


Example 12. If M = {α | α : X → X is a mapping} under composition, the units
in M are the bijections (onto and one-to-one mappings). (See Theorem 6 §0.3.)


Example 12 is important, and we refer to it again. If X = {1, 2, …, n}, the set
of units is the symmetric group $S_n$ of degree n (Section 1.4). If X = $\mathbb{N}$, we get a
monoid containing maps σ and τ such that στ = 1 but τσ ≠ 1. (Example 8 §0.3.)


Example 13. The units in $(M_n(\mathbb{R}), \cdot)$ are the matrices A with det A ≠ 0, where
det A denotes the determinant of A.²¹ This is discussed in Appendix B.


²¹ See Nicholson, W.K., Linear Algebra with Applications, 7th ed., McGraw-Hill Ryerson, 2012
(Theorem 2 §3.2).




The next theorem collects several basic properties of units that will be used 
without comment throughout the book. This theorem will be familiar to students 
of linear algebra where it is proved for invertible matrices. 


Theorem 5. Let $a, b, a_1, a_2, \dots, a_{n-1}, a_n$ denote elements in a monoid M.
(1) 1 is a unit and $1^{-1} = 1$.
(2) If a is a unit so is $a^{-1}$, and $(a^{-1})^{-1} = a$.
(3) If a and b are units so is ab, and $(ab)^{-1} = b^{-1}a^{-1}$.
(4) If $a_1, a_2, \dots, a_{n-1}, a_n$ are units, so is $a_1a_2\cdots a_{n-1}a_n$, and

$$(a_1a_2\cdots a_{n-1}a_n)^{-1} = a_n^{-1}a_{n-1}^{-1}\cdots a_2^{-1}a_1^{-1}.$$

(5) If a is a unit so is $a^n$ for any $n \ge 0$, and $(a^n)^{-1} = (a^{-1})^n$.


Proof. (1), (2), and (3) depend on the fact that, if ab = 1 = ba for some b then a is a
unit and $a^{-1} = b$. Thus (1) follows from 1 · 1 = 1; (2) follows from $a^{-1}a = 1 = aa^{-1}$,
and (3) follows if we can show that

$$(ab)(b^{-1}a^{-1}) = 1 \quad \text{and} \quad 1 = (b^{-1}a^{-1})(ab).$$

But $(ab)(b^{-1}a^{-1}) = a(bb^{-1})a^{-1} = a1a^{-1} = aa^{-1} = 1$. The other equation can be
similarly verified.

Finally (4) follows from (3) by induction on n (Exercise 16), and (5) is the special
case of (4) where $a_1 = a_2 = \cdots = a_n = a$. ∎
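The reversal of order in part (3) of Theorem 5 matters whenever the monoid is not commutative. As a small illustration (a sketch under assumptions, not taken from the text), the code below takes the units of the monoid of mappings of {0, 1, 2} under composition, that is, the bijections, stores each as a tuple, and compares the inverse of a product with the inverses multiplied in both orders. The helper names `compose` and `inverse` are hypothetical.

```python
from itertools import permutations

def compose(f, g):
    """(f g)(x) = f(g(x)) for mappings of {0, ..., n-1} stored as tuples."""
    return tuple(f[g[x]] for x in range(len(g)))

def inverse(f):
    """Inverse of a bijection stored as a tuple."""
    inv = [0] * len(f)
    for x, y in enumerate(f):
        inv[y] = x
    return tuple(inv)

units = list(permutations(range(3)))       # the six bijections of {0, 1, 2}

# (ab)^(-1) = b^(-1) a^(-1) holds for every pair of units ...
reversed_rule = all(inverse(compose(a, b)) == compose(inverse(b), inverse(a))
                    for a in units for b in units)
# ... whereas (ab)^(-1) = a^(-1) b^(-1) fails for some pairs.
same_order_rule = all(inverse(compose(a, b)) == compose(inverse(a), inverse(b))
                      for a in units for b in units)
print(reversed_rule, same_order_rule)      # True False
```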


Note that every monoid has at least one unit: the unity. Moreover, if M is the
set of all subsets of a set U, then (M, ∩) and (M, ∪) are monoids in which the unity is
the only unit. At the other extreme are the monoids (called groups) in which every
element is a unit. These are the principal objects of study in this chapter. With
this in mind, we extend the definition of nth powers to include negative powers of
a unit. Since $(a^{-1})^n = (a^n)^{-1}$ by Theorem 5 for any unit a, we define the negative
powers $a^{-n}$, $n \ge 1$, by

$$a^{-n} = (a^{-1})^n = (a^n)^{-1}.$$

Then the laws of exponents extend as follows (the proof is left to the reader).


Theorem 6. Let a and b denote units in a monoid M.
(1) $a^na^m = a^{n+m}$ for all $n, m \in \mathbb{Z}$.
(2) $(a^n)^m = a^{nm}$ for all $n, m \in \mathbb{Z}$.
(3) If ab = ba, then $(ab)^n = a^nb^n$ for all $n \in \mathbb{Z}$.


Exercises 2.1 


1. In each case a binary operation * is given on a set M. Decide whether it is
commutative or associative, whether a unity exists, and find the units (if there is a
unity).

(a) M = $\mathbb{Z}$; a * b = a − b

(b) M = $\mathbb{Q}$; a * b = $\frac{1}{2}ab$

(c) M = $\mathbb{R}$; a * b = a + b − ab

(d) M = any set with |M| ≥ 2; a * b = b

(e) M = P × Q, where P and Q are sets with |P| ≥ 2 and |Q| ≥ 2;
$(p, q) * (p', q') = (p, q')$

(f) M = $\mathbb{N}$; m * n = max(m, n), the larger of m and n

(g) M = $\mathbb{N}^+$; a * b = gcd(a, b)

(h) M = $\mathbb{R} \times \mathbb{R}$; $(x, y) * (x', y') = (xx', xy')$

(i) M = $\mathbb{R} \times \mathbb{R} \times \mathbb{R}$; $(x, y, z) * (x', y', z') = (xx', xy' + yz', zz')$

(j) M = $\mathbb{R} \times \mathbb{R} \times \mathbb{R}$; $(x, y, z) * (x', y', z') = (xy', yy', yz')$

2. (a) If x, y, or z is 1, show that (xy)z = x(yz).
(b) Show that there are exactly two monoids with two elements.
(c) Let S be a set with an associative binary operation but with no unity. Choose
an element 1 ∉ S, write M = {1} ∪ S, and define an operation on M by using the
operation of S and 1s = s = s1 for all s ∈ S. Show that M is a monoid.


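3. Consider the partial Cayley tables:

$$(1)\quad \begin{array}{c|cc} & a & b \\ \hline a & b & \\ b & a & \end{array}
\qquad \text{and} \qquad
(2)\quad \begin{array}{c|cc} & a & b \\ \hline a & a & \\ b & b & \end{array}$$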


(a) Show that there is only one way to complete table (1) so that the resulting
operation is associative, and that the result makes {a, b} into a commutative monoid.
(b) Show that there are three associative completions of table (2), two making {a, b}
into a commutative monoid and one having no unity.

4. If M is any monoid, let $\hat{M}$ denote the set of all nonempty subsets of M and define
an operation on $\hat{M}$ by XY = {xy | x ∈ X, y ∈ Y}. Show that $\hat{M}$ is a monoid,
commutative if M is, and find the units.

5. Given an alphabet A, call an n-tuple $(a_1, a_2, \dots, a_n)$ with $a_i \in A$ a word of length
n from A and write it (as in English) as $a_1a_2\cdots a_n$. Multiply two words by

$$(a_1a_2\cdots a_n) \cdot (b_1b_2\cdots b_m) = a_1a_2\cdots a_nb_1b_2\cdots b_m,$$

and call this product juxtaposition. Thus, the product of “no” and “on” is “noon”.
We decree the existence of an empty word with no letters. Show that the set W
of all words from A is a monoid, noncommutative if |A| > 1, and find the units.

6. Given a set X and a monoid M, let F = {σ | σ : X → M is a mapping}. Given σ
and τ in F, define σ · τ : X → M by (σ · τ)(x) = σ(x)τ(x). Show that this definition
makes F into a monoid, commutative if M is, and find all the units.

7. If M and N are monoids, show that the cartesian product M × N is a monoid (called
the direct product of M and N) using the operation (m, n)(m′, n′) = (mm′, nn′).
When is M × N commutative? Describe the units.

8. An element e of a monoid M is called an idempotent if $e^2 = e$.
(a) If a ∈ M satisfies $a^m = a^{m+n}$, where $m \ge 0$ and $n \ge 1$, show that some power of
a is an idempotent. [Hint: $a^{m+r} = a^{m+kn+r}$ for all $k \ge 1$ and $r \ge 0$.]
(b) If M is finite, show that some positive power of every element is an idempotent.

9. Assume that a is left cancelable in a monoid M (ab = ac implies that b = c).
(a) If $a^5 = b^5$ and $a^{12} = b^{12}$ in M, show that a = b.
(b) If $a^m = b^m$ and $a^n = b^n$ where m and n are relatively prime, show that a = b.


10. If ab = ba in a monoid M, prove that $(ab)^n = a^nb^n$ for all $n \ge 0$ (Theorem 3(3)).

11. An element e is called a left (right) unity for an operation if ex = x (xe = x) for
all x. If an operation has two left unities, show that it has no right unity.

12. (a) If u is a unit in a monoid M, show that au = bu in M implies that a = b.
(b) If M is a finite monoid and au = bu in M implies that a = b, show that u is a
unit. [Hint: If M = $\{a_1, \dots, a_n\}$ show that $a_1u, \dots, a_nu$ are distinct.]


13. If uv is a unit in a monoid M, and if av = bv implies that a = b in M, show that u
and v are both units.

14. If uv = 1, we say that u is a left inverse of v and v is a right inverse of u. If u has
both a left and a right inverse in a monoid, show that u is a unit.

15. If M is a monoid and u ∈ M, let σ : M → M be defined by σ(a) = ua for all a ∈ M.
(a) Show that σ is a bijection if and only if u is a unit.
(b) If u is a unit, describe the inverse mapping $\sigma^{-1}$ : M → M.

16. If $u_1, u_2, \dots, u_{n-1}, u_n$ are units in a monoid, show that $u_1u_2\cdots u_{n-1}u_n$ is also a unit
and that $(u_1u_2\cdots u_{n-1}u_n)^{-1} = u_n^{-1}u_{n-1}^{-1}\cdots u_2^{-1}u_1^{-1}$ (Theorem 5(4)).

17. Let u and v be units in a monoid M.
(a) If $u^{-1} = v^{-1}$, show that u = v.
(b) If a ∈ M and ua = au, show that $u^{-1}a = au^{-1}$.
(c) If uv = vu, show that $u^{-1}v^{-1} = v^{-1}u^{-1}$.

18. Prove that the following are equivalent for a monoid M.
(1) If ab is a unit then both a and b are units.
(2) If ab = 1, then ba = 1.

19. If M is a finite monoid and uv = 1 in M, prove that vu = 1. [Hint: Exercise 12(b).]

20. Let M be a commutative monoid. Define a relation ~ on M by a ~ b if a = bu for
some unit u.
(a) Show that ~ is an equivalence on M.
(b) If $\bar{a}$ denotes the equivalence class of a, let $\bar{M} = \{\bar{a} \mid a \in M\}$ denote the set of all
equivalence classes. Show that $\bar{a}\,\bar{b} = \overline{ab}$ is a well-defined operation on $\bar{M}$.
(c) If $\bar{M}$ is as in (b), show that $\bar{M}$ is a commutative monoid in which the unity $\bar{1}$ is
the only unit.

21. If M is a monoid, define E(M) = {α : M → M | α(xy) = α(x) · y for all x, y ∈ M}. If
a ∈ M, define $\alpha_a$ : M → M by $\alpha_a(x) = ax$ for all x ∈ M.
(a) Show that E(M) is a monoid under composition of mappings.
(b) Show that $\alpha_a \in E(M)$ for all a ∈ M.
(c) If θ : M → E(M) is defined by $\theta(a) = \alpha_a$ for all a ∈ M, show that θ is onto and
one-to-one, $\theta(1) = 1_M$, and θ(ab) = θ(a) · θ(b) for all a, b ∈ M.

22. Show that there are exactly six monoids M with three elements. If M = {1, a, b},
consider first the case $a^2 = 1$ (then only one multiplication table is possible). If $a^2 = b$,
then M = $\{1, a, a^2\}$ is commutative and there are three monoids. Then two more
emerge if $a^2 = a$. Note that, although associativity is used to force the multiplication
table in every case, the associativity in the resulting table must be checked (Exercise
2(a) is useful).

2.2 GROUPS 


A group is a monoid in which every element has an inverse. Because of its impor- 
tance, we give the definition in full detail. A set G is called a group if it satisfies 
the following axioms. 


G1 G is closed under a binary operation. 
G2 The operation is associative. 

G3 There is a unity element in G. 

G4 Every element of G has an inverse in G. 




The group G is called abelian²² if, in addition, it satisfies
G5 The operation is commutative. 


If G is finite, the number |G| of elements in G is called the order of G. 

The terminology used for groups (and other algebraic systems, such as monoids) 
is somewhat careless. Strictly speaking, a group (G,-) consists of two things: a set 
G and a binary operation. However, common practice is simply to refer to a group 
G and not mention the operation. This practice usually causes no difficulty, because 
the operation in the group in question is understood. We adopt this loose notation 
because it is much simpler, and also to acquaint the reader with what is in fact used 
in more advanced books. When clarity is needed, we use terms such as the group 
(G,+) or the additive group G. 

Examples 1-10 indicate the variety of ways that groups can occur, and we refer 
to many of them later. We leave verification of the axioms to the reader. 


Example 1. {1}, {1, −1}, and {1, −1, i, −i} are all abelian groups of (complex)
numbers under multiplication. Here −1 is self-inverse, and i and −i are inverses of
each other.


Example 2. $\mathbb{Q} \setminus \{0\}$,²³ $\mathbb{R} \setminus \{0\}$, and $\mathbb{C} \setminus \{0\}$ are all abelian groups under
multiplication. In each case the inverse of an element a is $a^{-1} = 1/a$.


Example 3. The set of complex numbers

$$\mathbb{C}^0 = \{z \in \mathbb{C} \mid |z| = 1\} = \{e^{i\theta} \mid \theta \in \mathbb{R}\}$$

is a group under complex multiplication. Here $e^{i\theta} = \cos\theta + i\sin\theta$, as in
Appendix A, and we have $e^{i\theta}e^{i\varphi} = e^{i(\theta+\varphi)}$ and $(e^{i\theta})^{-1} = e^{-i\theta}$. The group $\mathbb{C}^0$ is called
the circle group because it consists of the points on the unit circle.


Example 4. For $n \ge 1$, the set $U_n = \{z \in \mathbb{C} \mid z^n = 1\}$ is a group under complex
multiplication, called the group of nth roots of unity. As in Appendix A, we have

$$U_n = \{e^{2\pi ik/n} \mid k = 0, 1, 2, \dots, n-1\}.$$

Clearly, $U_n \subseteq \mathbb{C}^0$ for all $n \ge 1$, and $U_1$, $U_2$, and $U_4$ are displayed in Example 1.


Example 5. The sets $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ are all abelian groups under addition. In each
case the unity is 0 and the inverse of x is −x.


Although we write most groups multiplicatively, many important groups are
written additively (as in Example 5). Then the unity element is denoted 0 and is
called zero, and the inverse of x is denoted −x and is called the negative of x.


²² The name honors the Norwegian mathematician Niels Henrik Abel.
²³ If X and Y are sets, the set difference is defined by X \ Y = {x ∈ X | x ∉ Y}.


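Example 6. If $n \ge 2$, $\mathbb{Z}_n$ is an additive abelian group with zero $\bar{0}$ and the negative
of $\bar{a}$ being $-\bar{a} = \overline{-a}$. We write $\bar{a} = a$ in $\mathbb{Z}_n$ when no confusion can result.


Henceforth, when we refer to one of the groups $\mathbb{Z}_n$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$, we mean the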
additive group. 


Example 7. The set $S_n$ of all permutations of {1, 2, …, n} is a group under
composition (see Theorem 2 §1.4), called the symmetric group of degree n.


The group $S_n$ has historical significance because such groups of bijections were
among the earliest examples of a group. They were used by Galois in his pioneering 
work on the theory of equations. In fact Galois was the first to use the term group. 


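Example 8. We single out $S_3$ for special emphasis. Recall from Section 1.4 that

$$S_3 = \{\varepsilon,\ (1\ 2\ 3),\ (1\ 3\ 2),\ (1\ 2),\ (1\ 3),\ (2\ 3)\}.$$

If we denote $\sigma = (1\ 2\ 3)$ and $\tau = (1\ 2)$, then $\sigma^2 = (1\ 3\ 2)$, $\tau\sigma = (2\ 3)$, and
$\tau\sigma^2 = (1\ 3)$ as is easily verified. Hence we can list $S_3$ as

$$S_3 = \{\varepsilon,\ \sigma,\ \sigma^2,\ \tau,\ \tau\sigma,\ \tau\sigma^2\}.$$

The reason for doing this is that it provides an easy way to fill in the Cayley table.
In fact, we can fill in the table by using three (easily verified) facts:

$$\sigma^3 = \varepsilon, \quad \tau^2 = \varepsilon, \quad \text{and} \quad \sigma\tau\sigma = \tau.$$

The resulting Cayley table is as follows.

$$\begin{array}{c|cccccc}
S_3 & \varepsilon & \sigma & \sigma^2 & \tau & \tau\sigma & \tau\sigma^2 \\ \hline
\varepsilon & \varepsilon & \sigma & \sigma^2 & \tau & \tau\sigma & \tau\sigma^2 \\
\sigma & \sigma & \sigma^2 & \varepsilon & \tau\sigma^2 & \tau & \tau\sigma \\
\sigma^2 & \sigma^2 & \varepsilon & \sigma & \tau\sigma & \tau\sigma^2 & \tau \\
\tau & \tau & \tau\sigma & \tau\sigma^2 & \varepsilon & \sigma & \sigma^2 \\
\tau\sigma & \tau\sigma & \tau\sigma^2 & \tau & \sigma^2 & \varepsilon & \sigma \\
\tau\sigma^2 & \tau\sigma^2 & \tau & \tau\sigma & \sigma & \sigma^2 & \varepsilon
\end{array}$$

Note that

$$\sigma\tau = \sigma\tau\varepsilon = \sigma\tau(\sigma\sigma^{-1}) = (\sigma\tau\sigma)\sigma^{-1} = \tau\sigma^{-1} = \tau\sigma^2.$$

Then, for example, we compute the product $(\tau\sigma)(\tau\sigma^2)$ by

$$(\tau\sigma)(\tau\sigma^2) = \tau(\sigma\tau)\sigma^2 = \tau(\tau\sigma^2)\sigma^2 = \tau^2\sigma^4 = \varepsilon\sigma^4 = \varepsilon\sigma = \sigma.$$

The other entries in the table are found in a similar manner (the reader should
do this). The elements σ and τ are called generators for $S_3$, and the equations
$\sigma^3 = \varepsilon$, $\tau^2 = \varepsilon$, and $\sigma\tau\sigma = \tau$ are called relations among the generators. We often
describe $S_3$ in this way. □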
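The Cayley table of this group can also be generated by composing the permutations directly, which gives an independent check on entries obtained from the generators and relations. The following sketch makes several assumptions of its own: a permutation of {1, 2, 3} is stored as the tuple of images of 1, 2, 3; products are composed right to left (the right-hand factor acts first); and the short names `e`, `s`, `t`, `s2`, `ts`, `ts2` are hypothetical labels for the six elements in the order used in the table.

```python
def compose(f, g):
    """(f g)(x) = f(g(x)): apply g first, then f. Permutations of {1, 2, 3} as tuples."""
    return tuple(f[g[x] - 1] for x in range(3))

e = (1, 2, 3)                    # the identity permutation
s = (2, 3, 1)                    # sigma = (1 2 3)
t = (2, 1, 3)                    # tau   = (1 2)
s2 = compose(s, s)
names = {e: "e", s: "s", s2: "s2", t: "t",
         compose(t, s): "ts", compose(t, s2): "ts2"}
elements = list(names)           # e, s, s2, t, ts, ts2

# The three relations: sigma^3 = e, tau^2 = e, sigma tau sigma = tau.
relations_hold = (compose(s, s2) == e and compose(t, t) == e
                  and compose(s, compose(t, s)) == t)

# Entry in row x, column y is the product xy.
table = [[names[compose(x, y)] for y in elements] for x in elements]
for x, row in zip(elements, table):
    print(f"{names[x]:>3} | " + " ".join(f"{entry:>3}" for entry in row))
```

For instance, the row labelled `ts` comes out as `ts ts2 t s2 e s`, in agreement with the product of τσ and τσ² computed in the example.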


Examples 9 and 10 display two other important groups of permutations. 


Example 9. The set $A_n$ of all even permutations in $S_n$ is a group using the
operation of $S_n$, called the alternating group of degree n (Theorem 8 §1.4).




Example 10. Given a (nonsquare) wire rectangle with vertices 1, 2, 3, and 4 as in
the diagram, consider the permutations of the vertices induced by moving the
rectangle in space (without bending). The 180°-rotations about vertical and
horizontal axes (see the diagram) give permutations

$$\sigma = (1\ 2)(3\ 4) \quad \text{and} \quad \tau = (1\ 4)(2\ 3)$$

respectively. If we compute their product in $S_4$ we obtain another motion
$\sigma\tau = (1\ 3)(2\ 4)$ because the composite motion στ is the motion τ followed by the
motion σ (the reader should verify this). Note that στ can also be viewed as the
180°-rotation in the plane of the rectangle about its center. Of course, we have
another motion τσ, but this is not a new motion because τσ = στ. We do get one
more motion ε, no motion at all. Hence we get a set K = {ε, σ, τ, στ} of four
motions. This is a group. It is closed because $\sigma^2 = \varepsilon$, $\tau^2 = \varepsilon$, and στ = τσ, and these
equations enable us to fill in the entire Cayley table:

$$\begin{array}{c|cccc}
K & \varepsilon & \sigma & \tau & \sigma\tau \\ \hline
\varepsilon & \varepsilon & \sigma & \tau & \sigma\tau \\
\sigma & \sigma & \varepsilon & \sigma\tau & \tau \\
\tau & \tau & \sigma\tau & \varepsilon & \sigma \\
\sigma\tau & \sigma\tau & \tau & \sigma & \varepsilon
\end{array}$$

Since K inherits associativity from $S_4$, it is a group because every element is
self-inverse. The group K is called the group of motions of the rectangle. Such
groups of motions are important (for example they arise in the study of symmetries
of molecules); we return to them in Section 2.7. □


Recall that a set M with an associative operation that has a unity is called a
monoid, and that an element u in M that has an inverse $u^{-1}$ in M is called a unit.
A monoid may not be a group, but its units form a group.


Theorem 1. If M is a monoid, the set $M^*$ of all units in M is a group using the
operation of M, called the group of units of M.


Proof. From Theorem 5 §2.1, if u and v are units, then uv is also a unit (the inverse
is $v^{-1}u^{-1}$), so $M^*$ is closed under the operation of M. The associativity of $M^*$ is
inherited from M and $1 \in M^*$ (in fact $1^{-1} = 1$), so $M^*$ itself is a monoid. Finally,
if $u \in M^*$, then $u^{-1} \in M^*$ too (its inverse is u), so $M^*$ is a group. ∎

Theorem 1 provides many important examples of groups. For example, the
multiplicative groups in Example 2 are $\mathbb{R}^* = \mathbb{R} \setminus \{0\}$, $\mathbb{Q}^* = \mathbb{Q} \setminus \{0\}$, and
$\mathbb{C}^* = \mathbb{C} \setminus \{0\}$. Note also that $\mathbb{Z}^* = \{1, -1\}$ and $\mathbb{N}^* = \{1\}$.

Example 11. If X is a nonempty set, M = {α | α : X → X is a mapping} is a
monoid under composition and Theorem 6 §0.3 shows that the group $M^*$ of units
consists of the bijections

$$S_X = \{\alpha \mid \alpha : X \to X \text{ is a bijection}\}.$$

The bijections X → X are called permutations of X, and $S_X$ is the permutation
group of X. Of course, if X = {1, 2, …, n} then $S_X = S_n$. □




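Example 12. Consider $\mathbb{Z}_n^*$, where $\mathbb{Z}_n$ is regarded as a multiplicative monoid. Then
Theorem 5 §1.3 gives

$$\mathbb{Z}_n^* = \{\bar{a} \in \mathbb{Z}_n \mid \gcd(a, n) = 1\}.$$

Hence $\mathbb{Z}_p^* = \mathbb{Z}_p \setminus \{\bar{0}\}$ if (and only if) p is a prime. Other examples include

$$\mathbb{Z}_4^* = \{1, 3\}, \quad \mathbb{Z}_6^* = \{1, 5\}, \quad \mathbb{Z}_8^* = \{1, 3, 5, 7\}, \quad \text{and} \quad \mathbb{Z}_9^* = \{1, 2, 4, 5, 7, 8\}.$$

We refer to these groups frequently. □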


Example 13. Let R denote $\mathbb{Z}_m$, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$. Then the set $M_n(R)$ of all n × n
matrices over R is a monoid using matrix multiplication. The group $M_n(R)^*$ of
units consists of the invertible n × n matrices over R, that is the matrices such that
det A is a unit in R (see Appendix B). It is called the general linear group of
degree n over R, denoted $GL_n(R)$. Thus

If R = $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$ then $GL_n(R) = \{A \in M_n(R) \mid \det A \ne 0\}$,

$GL_n(\mathbb{Z}) = \{A \in M_n(\mathbb{Z}) \mid \det A = \pm 1\}$, and

$GL_n(\mathbb{Z}_m) = \{A \in M_n(\mathbb{Z}_m) \mid \det A = \bar{a} \text{ where } \gcd(a, m) = 1\}$.


If $G_1, G_2, \dots, G_n$ are sets, recall that the cartesian product $G_1 \times G_2 \times \cdots \times G_n$ is
the set of all ordered n-tuples $(g_1, g_2, \dots, g_n)$, where $g_i \in G_i$ for each i. This set has
a natural group structure when the $G_i$ are themselves groups. If $G_1, G_2, \dots, G_n$ are
groups, their direct product is the set $G_1 \times G_2 \times \cdots \times G_n$ with the componentwise
operation defined by

$$(g_1, g_2, \dots, g_n) \cdot (g_1', g_2', \dots, g_n') = (g_1g_1', g_2g_2', \dots, g_ng_n')$$

where $g_ig_i'$ is the product in $G_i$ for each i. The routine proof of the next theorem
is left to the reader.


Theorem 2. If $G_1, G_2, \dots, G_n$ are groups, so also is $G_1 \times G_2 \times \cdots \times G_n$, with
unity (1, 1, …, 1) and inverses $(g_1, g_2, \dots, g_n)^{-1} = (g_1^{-1}, g_2^{-1}, \dots, g_n^{-1})$.

Because groups are monoids, all the properties of monoids presented in Section
2.1 are automatically properties of groups. In particular:

(1) The unity 1 is unique.
(2) The inverse $g^{-1}$ of an element g is uniquely determined by g.
(3) General associativity holds (Theorem 2 §2.1).

The next theorem restates Theorem 5 §2.1 for units in monoids for reference.


Theorem 3. Let $g, h, g_1, g_2, \dots, g_{n-1}, g_n$ denote elements of a group G.

(1) $1^{-1} = 1$.

(2) $(g^{-1})^{-1} = g$.

(3) $(gh)^{-1} = h^{-1}g^{-1}$.

(4) $(g_1g_2\cdots g_{n-1}g_n)^{-1} = g_n^{-1}g_{n-1}^{-1}\cdots g_2^{-1}g_1^{-1}$ for all $n \ge 1$.

(5) $(g^n)^{-1} = (g^{-1})^n$ for all $n \ge 0$.

Recall that negative powers of an element g in a group are defined by $g^{-k} = (g^{-1})^k$
for $k \ge 1$. The next theorem is a restatement of Theorem 6 §2.1.




Theorem 4. Exponent Laws. Let G be a group with elements g and h.
(1) $g^ng^m = g^{n+m}$ for all $n, m \in \mathbb{Z}$.
(2) $(g^n)^m = g^{nm}$ for all $n, m \in \mathbb{Z}$.
(3) If gh = hg, then $(gh)^n = g^nh^n$ for all $n \in \mathbb{Z}$.

These laws are important and play a prominent role in Section 2.4.

The assumption that every element of a group has an inverse is a very powerful
axiom. In particular, it implies the cancellation laws, which we use countless times
in this book.


Theorem 5. Cancellation Laws. Let g, h, and f be elements of a group.

(1) If gh = gf, then h = f. (left cancellation)

(2) If hg = fg, then h = f. (right cancellation)

Proof. If gh = gf, then left multiplication by $g^{-1}$ gives $(g^{-1}g)h = (g^{-1}g)f$. Hence
1h = 1f; that is, h = f. This proves (1), and (2) follows similarly. ∎


Note that “mixed” cancellation is not valid in general: fg = gh does not imply that
f = h. For example, in the group $S_3$, we have (1 2)(1 3) = (1 3)(2 3) so (1 3)
cannot be cancelled.


Example 14. If G is a finite group and g ∈ G, show that $g^n = 1$ for some $n \ge 1$.


Solution. The elements $g, g^2, g^3, \dots$ in G cannot all be distinct because G is finite.
So $g^m = g^{m+n}$ for some $m \ge 1$ and $n \ge 1$. Thus $g^m \cdot 1 = g^m \cdot g^n$, so $1 = g^n$ by
cancellation. □
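The conclusion of Example 14 can be observed in a concrete finite group. The sketch below is an illustration under assumptions: it works in the group of units modulo 9 under multiplication, and the hypothetical function `first_power_equal_to_one` simply multiplies by g until 1 appears. The loop is guaranteed to stop only because g is a unit; for a residue that is not a unit it would run forever.

```python
def first_power_equal_to_one(g, n):
    """Smallest k >= 1 with g**k = 1 modulo n; g must be a unit modulo n."""
    power, k = g % n, 1
    while power != 1:
        power = (power * g) % n
        k += 1
    return k

for g in (1, 2, 4, 5, 7, 8):               # the units modulo 9
    print(g, first_power_equal_to_one(g, 9))
# 1 1
# 2 6
# 4 3
# 5 6
# 7 3
# 8 2
```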


Another consequence of the fact that all elements of a group have inverses is 
that equations gx = h and xg = h are always solvable. 


Theorem 6. Let g and h be elements of a group G.
(1) The equation gx = h has a unique solution $x = g^{-1}h$ in G.
(2) The equation xg = h has a unique solution $x = hg^{-1}$ in G.


Proof. If $x = g^{-1}h$, then $gx = gg^{-1}h = 1h = h$, so x is indeed a solution in (1).
To prove that it is unique, let y also satisfy gy = h. Then gx = gy, so x = y by
cancellation. This proves (1), and (2) follows in the same way. ∎


Corollary. Every row (and column) of the Cayley table of a group $G$ contains
every element of $G$ exactly once.

Proof. If $g \in G$, the row of the table corresponding to $g$ consists of the elements
$gx$ as $x$ ranges over $G$. This row contains every element $h$ of $G$ because $gx = h$ is
solvable for each $h$, and it contains $h$ only once because the solution is unique. A
similar argument applies to columns. ■
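The Corollary is easy to test by machine on small examples. The following is a minimal sketch, not part of the text's development: the helper names `cayley_table` and `rows_and_columns_are_permutations` are our own, and the two sample structures (the additive group $\mathbb{Z}_6$, and $\mathbb{Z}_4$ under multiplication, which is only a monoid) are chosen purely for illustration.

```python
from itertools import product

def cayley_table(elements, op):
    """Return the Cayley table as a dict mapping (g, h) to op(g, h)."""
    return {(g, h): op(g, h) for g, h in product(elements, repeat=2)}

def rows_and_columns_are_permutations(elements, table):
    """Check that every row and every column lists each element exactly once."""
    target = sorted(elements)
    rows_ok = all(sorted(table[g, x] for x in elements) == target for g in elements)
    cols_ok = all(sorted(table[x, g] for x in elements) == target for g in elements)
    return rows_ok and cols_ok

# Illustrative group: the integers modulo 6 under addition.
z6 = list(range(6))
group_table = cayley_table(z6, lambda g, h: (g + h) % 6)

# Illustrative non-group: the integers modulo 4 under multiplication
# (0 and 2 have no inverse, so the Corollary does not apply).
residues_mod_4 = list(range(4))
monoid_table = cayley_table(residues_mod_4, lambda g, h: (g * h) % 4)

print(rows_and_columns_are_permutations(z6, group_table))
print(rows_and_columns_are_permutations(residues_mod_4, monoid_table))
```

The check succeeds for the group and fails for the monoid, since the row of 0 in the second table consists entirely of zeros. Passing the check does not by itself prove that a table is a group table, because associativity is not tested.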


A group is determined completely by its Cayley table: associativity and
existence of the unity and inverses, which are demanded by the group axioms,
all depend entirely on the operation. Now consider the (multiplicative) group
$\mathbb{Z}^* = \{1, -1\}$ of units of $\mathbb{Z}$ and the (additive) group $\mathbb{Z}_2 = \{0, 1\}$. The Cayley
tables are

    Z*  |  1  -1          Z_2 | 0  1
    ----+--------         ----+------
     1  |  1  -1           0  | 0  1
    -1  | -1   1           1  | 1  0

They are the same in the sense that the Cayley table of $\mathbb{Z}^*$ becomes that of $\mathbb{Z}_2$ if
we replace the symbols 1 and $-1$ by 0 and 1, respectively. Thus $\mathbb{Z}^*$ and $\mathbb{Z}_2$ are the
same groups except for notation, and we say that they are isomorphic, or that
they are the same up to isomorphism. We discuss this topic in more detail in
Section 2.5; for now we prefer to treat the whole matter informally and call two
groups isomorphic if they have the same Cayley table except for notation. As a
result we can give an application of the Corollary to Theorem 6.


Example 15. Show that, up to isomorphism, there is only one group $G$ of order
1, 2, or 3, and that group can be described in the following manner.

• If $|G| = 1$, then $G = \{1\}$.

• If $|G| = 2$, then $G = \{1, g\}$, where $g^2 = 1$.

• If $|G| = 3$, then $G = \{1, g, g^2\}$, where $g^3 = 1$.

In each case the Cayley table is determined by the laws of exponents.

Solution. In each case we show that there is only one way to fill in the Cayley table.
Multiplication by 1 is prescribed. If $|G| = 1$, then $G = \{1\}$ and the Cayley table is
determined. If $|G| = 2$, let $G = \{1, g\}$, where $g \neq 1$. The only entry in the Cayley
table that is in doubt is whether $g^2 = g$ or $g^2 = 1$. But $g^2 = g$ is impossible because
it implies that $g = 1$ by cancellation. Hence $g^2 = 1$ and the table is determined.
Turning to the case $|G| = 3$, write $G = \{1, g, h\}$. Then $gh \neq g$ and $gh \neq h$
by cancellation, so we must have $gh = 1$. Now repeated use of the Corollary to
Theorem 6 (beginning with row 2 and column 3) gives the table on the left.

      | 1  g  h               | 1    g    g^2
    --+---------          ----+---------------
    1 | 1  g  h           1   | 1    g    g^2
    g | g  h  1           g   | g    g^2  1
    h | h  1  g           g^2 | g^2  1    g

In particular, $g^2 = h$, so $G = \{1, g, g^2\}$, and $g^3 = gh = 1$, as shown in the
table on the right. This table is associative, a known realization being the group
$\{\varepsilon, (1\ 2\ 3), (1\ 3\ 2)\}$ of permutations. □


The groups in Example 15 all have the rather special property that every element
is a power of a particular element and are called cyclic groups. There exists a
cyclic group of order $n$ for every $n \geq 1$. Indeed the group $U_n$ of $n$th roots of unity
is cyclic of order $n$. In fact, if we write $\omega = e^{2\pi i/n}$ then $U_n = \{1, \omega, \omega^2, \ldots, \omega^{n-1}\}$
has order $n$ and $\omega^n = 1$.

We discuss cyclic groups in detail in Section 2.4 and treat them informally for
now. They occur frequently, and the following generic notation is useful. Given
$n \geq 1$, the cyclic group of order $n$ is the group $C_n$ of order $n$:

$C_n = \{1, a, a^2, \ldots, a^{n-1}\}, \quad a^n = 1.$

We write $C_n = \langle a \rangle$ in this case, and the element $a$ is called a generator of $C_n$. Our
insistence that $|C_n| = n$ means that $1, a, a^2, \ldots, a^{n-1}$ are distinct elements of $C_n$.


The Cayley table of $C_n$ is determined completely by the exponent laws and the
condition $a^n = 1$. In fact, exponents in $C_n$ can be reduced modulo $n$. That is, if
$k = qn + r$, where $0 \leq r \leq n - 1$, then $a^k = a^r$ because $a^k = (a^n)^q a^r = 1^q a^r = a^r$.
In particular, 

Qj sa") Tern, 2.001, 




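This expression gives the Cayley table for $C_n$ (below), and so is sufficient for all
computations in $C_n$.

    C_n     | 1        a        a^2      ...  a^{n-2}  a^{n-1}
    --------+---------------------------------------------------
    1       | 1        a        a^2      ...  a^{n-2}  a^{n-1}
    a       | a        a^2      a^3      ...  a^{n-1}  1
    a^2     | a^2      a^3      a^4      ...  1        a
    ...     | ...
    a^{n-2} | a^{n-2}  a^{n-1}  1        ...  a^{n-4}  a^{n-3}
    a^{n-1} | a^{n-1}  1        a        ...  a^{n-3}  a^{n-2}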


Example 16. Let $C_{12} = \{1, a, a^2, \ldots, a^{11}\}$, $a^{12} = 1$, be a cyclic group of order 12.
Compute $a^{89}$ and $a^{-40}$ in $C_{12}$.

Solution. Because $89 = 7 \cdot 12 + 5$, we get $a^{89} = (a^{12})^7 a^5 = 1^7 a^5 = a^5$. Similarly,
$-40 = (-4) \cdot 12 + 8$, so $a^{-40} = (a^{12})^{-4} a^8 = 1^{-4} a^8 = a^8$. □
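The reduction of exponents in Example 16 is ordinary division with remainder, so it can be sketched in a line of code. The function names `reduce_exponent` and `multiply_in_cyclic_group` below are our own, and elements of $C_n$ are represented only by their exponents; this is an illustration under those assumptions, not a construction from the text.

```python
def reduce_exponent(k, n):
    """Return r with 0 <= r <= n - 1 such that a**k == a**r in a cyclic group of order n."""
    # For n > 0, Python's % operator already returns a remainder in 0..n-1,
    # even when k is negative, which matches the division algorithm used above.
    return k % n

def multiply_in_cyclic_group(j, k, n):
    """Exponent of the product a**j * a**k in a cyclic group of order n."""
    return reduce_exponent(j + k, n)

print(reduce_exponent(89, 12))
print(reduce_exponent(-40, 12))
print(multiply_in_cyclic_group(7, 9, 12))
```

The first two calls reproduce the exponents 5 and 8 found in Example 16; the third computes $a^7 a^9 = a^{16} = a^4$ in $C_{12}$.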


Example 17. Show that $b^n = 1$ for every element $b$ of $C_n$.

Solution. Write $C_n = \langle a \rangle$ where $a^n = 1$. Then $b = a^k$ for some $k$, so we have
$b^n = (a^k)^n = a^{kn} = a^{nk} = (a^n)^k = 1^k = 1$. □


Example 15 shows that every group of order 1, 2, or 3 is cyclic. However, this is 
not the case for groups of order 4. 


Example 18. Show that there are only two groups of order 4, the cyclic group $C_4$
and a noncyclic group $K_4$ whose Cayley table is shown below.

Solution. Let $G = \{1, a, b, c\}$ be any group of order 4. The way that 1 multiplies
is prescribed. Suppose first that $ab = 1$. Then $ac$ cannot be $a$, 1, or $c$ (by the
Corollary to Theorem 6), so $ac = b$. Hence $a^2 = c$, again by the Corollary. In the
same way repeated use of the Corollary shows that the Cayley table is the one on
the left below. In that case $a^2 = c$, $a^3 = ca = b$, and $a^4 = c^2 = 1$, so
$G = \{1, a, a^2, a^3\} = \langle a \rangle$ is cyclic.

      | 1  a  b  c            | 1  a  b  c
    --+------------         --+------------
    1 | 1  a  b  c          1 | 1  a  b  c
    a | a  c  1  b          a | a  1  c  b
    b | b  1  c  a          b | b  c  1  a
    c | c  b  a  1          c | c  b  a  1

Similarly if the product of any two of $a$, $b$, and $c$ equals 1 then $G$ is cyclic (possibly
with a different generator). Thus, if $G$ is not cyclic, the product of any two of $a$, $b$,
and $c$ must equal the third (for example, $bc \neq b$, $c$, or 1, so $bc = a$). Hence we get
the Cayley table on the right as required. □


The group $K_4 = \{1, a, b, c\}$ in Example 18 is called the Klein group.²⁴ The
multiplication can be described as follows: $a^2 = b^2 = c^2 = 1$, and the product of any
two of $a$, $b$, and $c$ is the third.


²⁴ The name honors Felix Klein. This group is also called the four group.



If you are nervous because we have not shown that $K_4$ is associative, you can
relax. The (associative) group $\mathbb{Z}_8^* = \{1, 3, 5, 7\}$ has exactly the Cayley table of $K_4$ if
we write $a = 3$, $b = 5$, and $c = 7$. Another instance of $K_4$ is the permutation group
$K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$. Example 18 shows that there are
two groups of order 4: the cyclic group and the (noncyclic) Klein group. The reader
should try to show that every group of order 5 is cyclic; in fact, if $p$ is any prime,
we show in Section 2.6 that every group of order $p$ is cyclic.
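The claim that $\mathbb{Z}_8^*$ has the Cayley table of $K_4$ amounts to two finite checks: every element squares to 1, and the product of any two of 3, 5, 7 is the third. A short sketch follows, with variable names of our own choosing.

```python
units = [1, 3, 5, 7]

def mul(x, y):
    """Multiplication in the group of units modulo 8."""
    return (x * y) % 8

# Condition 1: a^2 = b^2 = c^2 = 1 (and trivially 1^2 = 1).
every_square_is_one = all(mul(x, x) == 1 for x in units)

# Condition 2: with a = 3, b = 5, c = 7, the product of any two is the third.
a, b, c = 3, 5, 7
products_give_third = mul(a, b) == c and mul(b, c) == a and mul(a, c) == b

print(every_square_is_one, products_give_third)
```

Both conditions hold, which is exactly the description of the Klein group given above.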


Exercises 2.2 


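1. In each case either show that $G$ is a group with the given operation or list the axioms
that fail.
(a) $G = \mathbb{N}$; addition
(b) $G = \{2n \mid n \in \mathbb{Z}\}$; addition
(c) $G = \mathbb{R}$; $a \cdot b = a + b + 1$
(d) $G = \mathbb{R}$; $a \cdot b = a + b - ab$
(e) $G = \{\varepsilon, (1\ 2), (1\ 3), (1\ 4)\}$; operation in $S_4$
(f) $G = \{0, 2, 4, 6\}$; addition in $\mathbb{Z}_8$
(g) $G = \{16, 12, 8, 4\}$; multiplication in $\mathbb{Z}_{20}$
(h) $G = \{q \in \mathbb{Q} \mid q > 0\}$; multiplication
(i) $G = \{\sigma : \mathbb{N} \to \mathbb{N} \mid \sigma \text{ is one-to-one}\}$; composition
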

(j) G = {a, b,c, d}; multiplication given by 


2. If $G$ is a group, let $G^{op}$ denote the set $G$ with a new multiplication given by $a \circ b = ba$.
Show that $G^{op}$ is a group.
3. In each case fill in the Cayley table, given that $G = \{1, a, b, c, d\}$ is a group.


(as) Glia bcd (b) Gili a bed 
1);1i a6 ed ITilabse4d 
ala 1 6b ala 
b | b b | b c d 
cle cele 
d\d did 


4. Is the empty set a group? Explain.

5. If M is a monoid, describe an easy way to determine whether M is a group by 
looking at the Cayley table. 

6. If $U$ is a set, let $G = \{X \mid X \subseteq U\}$. Show that $G$ is an abelian group under the
operation $\oplus$ defined by $X \oplus Y = (X \setminus Y) \cup (Y \setminus X)$.


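7. Show that the set $G = \left\{ \begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix} \;\middle|\; a, b, c \in \mathbb{R} \right\}$ is a group under matrix
multiplication.
8. In each case show that $G$ is a group using the operation of $S_4$, and determine how
many elements $\sigma$ of $G$ satisfy $\sigma^2 = \varepsilon$.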


(a) $G = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$
(b) $G = \{\varepsilon, (1\ 2\ 3\ 4), (1\ 3)(2\ 4), (1\ 4\ 3\ 2)\}$

9. Let $\sigma = (1\ 2\ 3\ 4\ 5\ 6)$ in $S_6$. Show that $G = \{\varepsilon, \sigma, \sigma^2, \sigma^3, \sigma^4, \sigma^5\}$ is a group
using the operation of $S_6$. Is $G$ abelian? How many elements $\tau$ of $G$ satisfy $\tau^2 = \varepsilon$?
$\tau^3 = \varepsilon$?

10. (a) If at = 1 and ab = ba? in a group, show that a= 1.
(b) If a& = 1 and ab = ba in a group, show that a? = 1 and ab = ba.
(c) If a& = 1 and ab = ba? in a group, show that a° = 1 and aba = b.

11. (a) If $(ab)^n = 1$ in a group where $n > 0$, show that $(ba)^n = 1$.
(b) Extend (a) to all $n \in \mathbb{Z}$.

12. Let $G$ be a group of order 4. Assume that 1, $a$, and $b$ are distinct elements of $G$ and
that $a^2 = 1$ and $b^2 = 1$. Show that $G = \{1, a, b, ab\}$ and fill in the Cayley table.

13. If $G$ is any group, define $\sigma : G \to G$ by $\sigma(g) = g^{-1}$. Show that $\sigma$ is onto and
one-to-one.

14. Given $a$, $b$, and $c$ in a group $G$, show that the equation $a^{-1}xb = c$ has a unique solution
$x \in G$.

15. Let $a \in G$ where $G$ is a group. If $X \subseteq G$ is a finite subset, write $Xa = \{xa \mid x \in X\}$.
Show that $X$ and $Xa$ have the same number of elements.

16. If $fgh = 1$ in a group $G$, show that $ghf = 1$. Must $gfh = 1$?

17. Recall that an element $e$ in a monoid is called an idempotent if $e^2 = e$. Describe all
the idempotents in a group $G$.

18. If $G$ is a group and $g, h \in G$, show that $gh = hg$ if and only if $g^{-1}h^{-1} = h^{-1}g^{-1}$.

19. Show that a group $G$ is abelian if and only if $(gh)^{-1} = g^{-1}h^{-1}$ for all $g$ and $h$ in $G$.

20. Show that a group $G$ is abelian if $g^2 = 1$ for all $g \in G$. Give an example showing
that the converse is false.

21. Show that a group $G$ is abelian if and only if $(gh)^2 = g^2h^2$ for all $g$ and $h$ in $G$.

22. Show that a group $G$ is abelian if $(gh)^3 = g^3h^3$, $(gh)^4 = g^4h^4$, and $(gh)^5 = g^5h^5$ for
all $g$ and $h$ in $G$.

23. Let $g$ be an element of a group $G$.
(a) Show that $g^2 = 1$ if and only if $g^{-1} = g$.
(b) If $|G|$ is finite and even, show that $g \neq 1$ in $G$ exists such that $g^2 = 1$.

24. Let $a$ and $b$ be elements of a group $G$. Prove that $(aba^{-1})^k = ab^ka^{-1}$ holds for all
$k \in \mathbb{Z}$ (including negative $k$).

25. If $a^5 = 1$ and $a^{-1}ba = b^m$ in a group, prove that $b^{m^5 - 1} = 1$. [Hint: Exercise 24.]

26. Show that every cyclic group $C_n$ of order $n$ is abelian.

27. Show that the additive group $\mathbb{Z}_n$ is cyclic.

28. Let $a$ and $b$ be elements of a group $G$. If $a^n = b^n$ and $a^m = b^m$ where $\gcd(m, n) = 1$,
show that $a = b$. [Hint: Theorem 4 §1.2.]

29. Let $G$ be a set with an associative operation defined on it. In each case show that
$G$ is a group.
(a) There is a left unity $e$ ($eg = g$ for all $g$ in $G$), and each element $g$ has a left
inverse ($hg = e$ for some $h$ in $G$).
(b) $G$ is finite and both cancellation laws hold.
(c) Both $gx = h$ and $xg = h$ are solvable in $G$ for all $g$ and $h$ in $G$.
(d) For all $g$ and $h$ in $G$, $gx = h$ has a unique solution in $G$.

30. If $G$ is an abelian group with $n$ elements, show that $g^n = 1$ for every $g \in G$. [Hint:
See the proof of Theorem 8 §1.3.]


2.3 SUBGROUPS 


Many important groups arise as subsets of known groups. Therefore, we are interested
in knowing which subsets $H$ of a group $G$ are themselves groups (with the
same operation). Thus a subset $H$ of a group $G$ is called a subgroup of $G$ if $H$
itself is a group using the operation of $G$. For example, $(\mathbb{Z}, +)$ is a subgroup of
$(\mathbb{R}, +)$. However the multiplicative group $(\mathbb{Q}^*, \cdot)$ is not a subgroup of $(\mathbb{R}, +)$, even
though $\mathbb{Q}^*$ is a subset of $\mathbb{R}$, because the operations are different.


Example 1. If G is any group, both {1} and G are subgroups of G. The sub- 
group {1} is the trivial subgroup of G. Any subgroup other than G is a proper 
subgroup. 


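Example 2. Each of the additive groups $\mathbb{Z} \subseteq \mathbb{Q} \subseteq \mathbb{R} \subseteq \mathbb{C}$ is a subgroup of the
larger ones.


Example 3. $A_n$ is a subgroup of $S_n$.

Example 4. If $C^{\circ} = \{z \in \mathbb{C} \mid |z| = 1\}$ denotes the circle group, then each of

$\{1, -1\} \subseteq \{1, -1, i, -i\} \subseteq C^{\circ} \subseteq \mathbb{C}^*$

is a subgroup of the larger ones.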


In each of these examples, the subgroups of a group G not only have the same 
operation as G, but they also share the same unity element and the same inverses. 
This observation is true in general and, in fact, provides a very useful test for when 
a subset of a group is actually a subgroup. 


Theorem 1. Subgroup Test. A subset $H$ of a group $G$ is a subgroup if and only
if the following three conditions are satisfied.

(1) $1_G \in H$, where $1_G$ is the unity of $G$.²⁵

(2) If $h \in H$ and $h_1 \in H$, then $hh_1 \in H$.

(3) If $h \in H$ then $h^{-1} \in H$, where $h^{-1}$ denotes the inverse of $h$ in $G$.

In this case, $H$ has the same unity as $G$ and, if $h \in H$, its inverse in $H$ is the same
as its inverse in $G$.


Proof. If $H$ satisfies (1), (2), and (3), then $H$ is closed by (2), the unity of $G$ is the
unity for $H$ by (1), and the inverse in $G$ of an element $h \in H$ serves as the inverse
of $h$ in $H$ by (3). As $H$ inherits the associative law from $G$, it is a subgroup.
Conversely, if $H$ is a subgroup, let $e$ denote the unity of $H$. Then $e^2 = e = e \cdot 1_G$,
so $e = 1_G$ by cancellation in $G$. This proves (1), and (2) follows because $H$ is closed
under the operation of $G$. Finally, if $h \in H$, let $h'$ denote its inverse in $H$. If $h^{-1}$
is the inverse in $G$, then $hh' = 1 = hh^{-1}$, so $h' = h^{-1}$ by cancellation in $G$. This
proves (3) and the last sentence in the theorem. ■


Theorem 1 is useful as the conditions are easily checked (see also Exercise 2). 


Example 5. If $R$ is one of $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ or $\mathbb{C}$, let $H = \{A \in M_2(R) \mid \det A = 1\}$. Show
that $H$ is a group using matrix multiplication, called the special linear group.


²⁵ To avoid confusion, we sometimes denote the unity of a group $G$ by $1_G$ when other groups are
present.




Solution. We have $H \subseteq M_2(R)^*$ (see Example 13 §2.2), so we show that it is a
subgroup of $M_2(R)^*$. We have $I \in H$ because $\det I = 1$. If $A$ and $B \in H$, then
$\det(AB) = \det A \det B = 1 \cdot 1 = 1$ and $\det A^{-1} = 1/\det A = 1/1 = 1$. These results
show that $AB \in H$ and $A^{-1} \in H$, so the subgroup test applies. □


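Example 6. If $n \geq 0$, write $n\mathbb{Z} = \{nk \mid k \in \mathbb{Z}\}$. Show that $n\mathbb{Z}$ is a subgroup of $\mathbb{Z}$.


Solution. The unity of $\mathbb{Z}$ is 0, and $0 = n \cdot 0 \in n\mathbb{Z}$. If $a$ and $b$ are in $n\mathbb{Z}$, write them
as $a = nk$ and $b = nm$, where $k \in \mathbb{Z}$ and $m \in \mathbb{Z}$. Then $a + b = n(k + m)$ and
$-a = n(-k)$ both lie in $n\mathbb{Z}$, so $n\mathbb{Z}$ is a subgroup of $\mathbb{Z}$ by the subgroup test. □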


Theorem 2. Finite Subgroup Test. If $H$ is a finite nonempty subset of a group
$G$, then $H$ is a subgroup of $G$ if and only if $H$ is closed ($h, h_1 \in H$ implies $hh_1 \in H$).


Proof. If $H$ is closed, let $h \in H$. Then each of $h, h^2, h^3, \ldots$ is in $H$ so, because $H$
is finite, they cannot all be distinct. Hence $h^n = h^{n+m}$ for some $n \geq 1$ and $m \geq 1$.
This means $1 = h^m$ by cancellation, so $1 \in H$ by hypothesis. But then $1 = h^{m-1}h$
implies that $h^{-1} = h^{m-1}$, so $h^{-1} \in H$, too. Because $H$ is closed by hypothesis, it is
a subgroup by Theorem 1. The converse is clear. ■


Example 7. Determine all subgroups of the Klein group $K_4 = \{1, a, b, c\}$, where
$a^2 = b^2 = c^2 = 1$ and the product of two of $a$, $b$, and $c$ is the third.

Solution. Each of $H_a = \{1, a\}$, $H_b = \{1, b\}$, and $H_c = \{1, c\}$ is a subgroup by
Theorem 2, because $a^2 = b^2 = c^2 = 1$. Any subgroup $H$ with $|H| \geq 3$ must
contain two of $a$, $b$, and $c$ and so contains the other one (their product). Thus, $H = G$
and the complete list of subgroups is $\{1\}$, $H_a$, $H_b$, $H_c$, and $G$. □


Example 8. Determine all subgroups of $C_4 = \{1, a, a^2, a^3\}$, $a^4 = 1$.


Solution. Let $H = \{1, a^2\}$. Then $H$ is a subgroup by Theorem 2 because
$(a^2)^2 = a^4 = 1$. Suppose that $K$ is a subgroup distinct from $\{1\}$ and $H$. Then either
$a \in K$ or $a^3 \in K$. If $a \in K$, then (because $K$ is closed) each power $a$, $a^2$, and $a^3$ is
in $K$, so $K = C_4$. Similarly, $K = C_4$ if $a^3 \in K$ because, as the reader can verify,
$C_4 = \{1, a^3, (a^3)^2, (a^3)^3\}$. Thus the subgroups are $\{1\}$, $H = \{1, a^2\}$, and $C_4$. □
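Examples 7 and 8 can be confirmed by brute force, since Theorem 2 says that the subgroups of a finite group are exactly its nonempty closed subsets. The sketch below is ours, not the text's: `is_closed` and `subgroups` are hypothetical helper names, $C_4$ is modelled by the exponents 0, 1, 2, 3 of $a$ (added modulo 4), and $K_4$ is modelled by the units modulo 8.

```python
from itertools import combinations

def is_closed(subset, op):
    """True if op(x, y) stays inside subset for all x, y in subset."""
    return all(op(x, y) in subset for x in subset for y in subset)

def subgroups(elements, op):
    """All nonempty closed subsets of a finite group (subgroups, by the finite subgroup test)."""
    found = []
    for size in range(1, len(elements) + 1):
        for combo in combinations(elements, size):
            subset = frozenset(combo)
            if is_closed(subset, op):
                found.append(subset)
    return found

# C4 modelled by exponents: a^j a^k = a^((j + k) mod 4).
c4_subgroups = subgroups([0, 1, 2, 3], lambda j, k: (j + k) % 4)

# K4 modelled by the units modulo 8 under multiplication.
k4_subgroups = subgroups([1, 3, 5, 7], lambda x, y: (x * y) % 8)

print(sorted(sorted(s) for s in c4_subgroups))
print(sorted(sorted(s) for s in k4_subgroups))
```

The search finds three subgroups of $C_4$ and five of $K_4$, in agreement with the two examples. The approach examines every subset, so it is only practical for very small groups.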

It is descriptive to draw the lattice diagram of all subgroups of a group $G$.
Here the subgroups are shown in such a way that a line can be drawn up from $K$
to $H$ whenever $K \subseteq H$. The diagrams for $K_4 = \{1, a, b, c\}$ and for a cyclic group
$C_4 = \{1, a, a^2, a^3\}$ of order 4 are given below.

        C_4                          K_4
         |                         /  |  \
     {1, a^2}          {1, a}   {1, b}   {1, c}
         |                         \  |  /
        {1}                          {1}

If $G$ is any group, the center of $G$ is defined²⁶ by

$Z(G) = \{z \in G \mid zg = gz \text{ for all } g \in G\}.$


²⁶ The notation $Z(G)$ comes from zentrum, the German word for center.




The elements in $Z(G)$ are said to be central in $G$.

Theorem 3. If $G$ is any group, then $Z(G)$ is an abelian subgroup of $G$.

Proof. Use the subgroup test. Clearly $1 \in Z(G)$. If $z \in Z(G)$, then $zg = gz$ for all
$g \in G$, so multiplying this equation on the left by $z^{-1}$ gives $g = z^{-1}gz$. Then
multiplication on the right by $z^{-1}$ gives $gz^{-1} = z^{-1}g$. Thus $z^{-1} \in Z(G)$. Finally,
if both $y$ and $z$ lie in $Z(G)$, then, for all $g \in G$,

$(yz)g = y(zg) = y(gz) = (yg)z = (gy)z = g(yz).$

Thus, $yz \in Z(G)$, so $Z(G)$ is a subgroup. It is clearly abelian. ■


Observe that $Z(G) = G$ if and only if $G$ is abelian. At the other extreme, it can
happen that $Z(G) = \{1\}$ so $G$ is as far from abelian as it can be. In fact we have:


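Example 9. If $n \geq 3$, show that $Z(S_n) = \{\varepsilon\}$, where $\varepsilon$ is the identity permutation.

Solution. If $\sigma \in S_n$, $\sigma \neq \varepsilon$, we must find $\tau \in S_n$ such that $\sigma\tau \neq \tau\sigma$. Because $\sigma \neq \varepsilon$,
choose $k$ and $m$ in $X_n = \{1, 2, \ldots, n\}$ such that $\sigma k = m \neq k$. Because $n \geq 3$, let
$l$, $k$, and $m$ be distinct, with $l \in X_n$, and take $\tau$ to be the transposition $\tau = (k\ l)$.
Then $(\tau\sigma)k = \tau m = m$ and $(\sigma\tau)k = \sigma l$, so it suffices to show that $\sigma l \neq m$. But if
$\sigma l = m$ then $\sigma l = \sigma k$, so $l = k$ because $\sigma$ is one-to-one, a contradiction. □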
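For small $n$ the conclusion of Example 9 can be checked directly from the definition of the center. In the sketch below (our own illustration), a permutation of $\{0, 1, \ldots, n-1\}$ is stored as a tuple whose entry in position $k$ is the image of $k$; the 0-based labelling and the names `compose` and `center_of_symmetric_group` are assumptions made for the code only.

```python
from itertools import permutations

def compose(sigma, tau):
    """Return the product sigma tau: apply tau first, then sigma."""
    return tuple(sigma[tau[k]] for k in range(len(tau)))

def center_of_symmetric_group(n):
    """All z in S_n with zg = gz for every g in S_n, found by exhaustive search."""
    group = list(permutations(range(n)))
    return [z for z in group if all(compose(z, g) == compose(g, z) for g in group)]

print(center_of_symmetric_group(3))
print(center_of_symmetric_group(4))
print(center_of_symmetric_group(2))
```

For $n = 3$ and $n = 4$ only the identity tuple survives, while for $n = 2$ both permutations are central, which shows that the hypothesis $n \geq 3$ cannot be dropped.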


We now turn to two important ways of manufacturing new subgroups from old
ones. The straightforward proof of Theorem 4 is left as Exercise 16.


Theorem 4. Let $H$ and $K$ be subgroups of a group $G$. Then their intersection

$H \cap K = \{g \in G \mid g \in H \text{ and } g \in K\}$

is also a subgroup of $G$.


Note that $H \cap K$ is a subgroup of both $H$ and $K$, and is the largest such subgroup
in the sense that if $X$ is a subgroup of both $H$ and $K$ then $X \subseteq H \cap K$. Incidentally,
the union $H \cup K$ of two subgroups is almost never a subgroup (see Exercise 17).
The next theorem introduces another important type of subgroup.


Theorem 5. Let $H$ be a subgroup of a group $G$. If $g \in G$, then

$gHg^{-1} = \{ghg^{-1} \mid h \in H\}$

is a subgroup of $G$. These subgroups are called the conjugates of $H$ in $G$.

Proof. Clearly, $1 = g1g^{-1}$ is an element of $gHg^{-1}$. Given $ghg^{-1}$, where $h \in H$,

$(ghg^{-1})^{-1} = (g^{-1})^{-1}h^{-1}g^{-1} = gh^{-1}g^{-1} \in gHg^{-1}.$

Finally $(ghg^{-1})(gh_1g^{-1}) = g(hh_1)g^{-1}$ for any $h$, $h_1$ in $H$, which shows that $gHg^{-1}$
is closed. Thus it is a subgroup by the subgroup test. ■


If $H$ is a subgroup of $G$, then $H = 1H1^{-1}$, so $H$ is always a conjugate of itself.
If $H$ is the only conjugate of $H$ in $G$ (that is, $gHg^{-1} = H$ for all $g \in G$), then $H$ is
said to be self-conjugate (or normal) in $G$. These subgroups play a fundamental
role in group theory, and will be investigated in detail in Sections 2.8, 2.9, and 2.10.
Example 10 displays a subgroup that is not self-conjugate.


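Example 10. Let $S_3 = \{\varepsilon, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$, where $\sigma^3 = \varepsilon = \tau^2$ and $\sigma\tau\sigma = \tau$.
Find the conjugates of the subgroup $H = \{\varepsilon, \tau\}$.


Solution. Clearly $\varepsilon H \varepsilon^{-1} = H$. Since $\sigma^{-1} = \sigma^2$ and $\sigma\tau\sigma = \tau$, we get

$\sigma H \sigma^{-1} = \{\sigma\varepsilon\sigma^{-1}, \sigma\tau\sigma^{-1}\} = \{\varepsilon, \sigma\tau\sigma^2\} = \{\varepsilon, \tau\sigma\}.$

Similarly, $\sigma^2 H \sigma^{-2} = \{\varepsilon, \tau\sigma^2\}$. These are all the conjugates of $H$ in $G$ (verify). □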
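The verification requested at the end of Example 10 can be carried out by computing $gHg^{-1}$ for all six $g \in S_3$. The following sketch is an illustration under our own conventions: permutations of $\{0, 1, 2\}$ are tuples of images, `tau` is one particular transposition and `sigma` one particular 3-cycle (any such choice gives the same counts), and the helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the product pq: apply q first, then p."""
    return tuple(p[q[k]] for k in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for k, image in enumerate(p):
        inv[image] = k
    return tuple(inv)

def conjugates(group, subgroup):
    """The set of all subsets g H g^{-1} as g ranges over the group."""
    return {
        frozenset(compose(compose(g, h), inverse(g)) for h in subgroup)
        for g in group
    }

s3 = list(permutations(range(3)))
identity = (0, 1, 2)
tau = (1, 0, 2)      # a transposition
sigma = (1, 2, 0)    # a 3-cycle
H = [identity, tau]
A = [identity, sigma, compose(sigma, sigma)]

print(len(conjugates(s3, H)))
print(conjugates(s3, A) == {frozenset(A)})
```

The subgroup of order 2 has three distinct conjugates, as found in Example 10, whereas the subgroup $\{\varepsilon, \sigma, \sigma^2\}$ has only itself as a conjugate and so is self-conjugate.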


Exercises 2.3 


1. In each case determine whether H is a subgroup of G. 
(a) H = {0,1,-1},G=Z 
(b) = {1, 3}, G=Z% 

(c) H= {1, 3}, G=Zi, 
(d) H={e,(1 2 3)}, G=S3 
(e) H= {e, (1 2)(3 4), (1 3) (2 4)}, G= S83 
10 -1 0 0-1 Ol 
om (hf. ea) Ls] joe om 
(g) H = {2, 4,6}, G=Ze 
(h)H=N,G=Z 
(i) H={(m,k)|m+k is even}, G=ZxZ 

2. If $H$ is a subset of a group $G$, show that $H$ is a subgroup if and only if $H$ is nonempty
and $ab^{-1} \in H$ whenever $a \in H$ and $b \in H$.

3. If $K$ is a subgroup of $H$, and $H$ is a subgroup of $G$, must $K$ be a subgroup of $G$?
Justify your answer. 

4. Let X =R\{0,1}. Show that G= {e,d1,A0, 41, 2,43} is a subgroup of Sx 
if e(e) =2, a(n) =1/(1—a), dala) = (@-1)/2, paw) =1/2, pala) =2/(2 - 0), 
and ji3(v) = 1—a, for alla € X. 

5. (a) If G is an abelian group, show that H = {a € G | a? = 1} is a subgroup of G. 
(b) Give an example where H is not a subgroup. 

6. (a) If G is an abelian group, show that H = {g? | g € G} is a subgroup of G. 

(b) Give an example showing that the converse of (a) is false. 
(c) Show that $H$ is not a subgroup if $G = A_4$.

7. (a) If $G$ is a group and $g \in G$, show that $\langle g \rangle = \{g^k \mid k \in \mathbb{Z}\}$ is a subgroup of $G$.
(b) If $G$ is finite, show that $\{g^k \mid k \in \mathbb{N}\}$ is a subgroup of $G$ for all $g \in G$.

8. If $X$ is a nonempty subset of a group $G$, let $\langle X \rangle$ be the set of all products of powers
of elements of $X$; more formally

$\langle X \rangle = \{x_1^{k_1} x_2^{k_2} \cdots x_m^{k_m} \mid m \geq 1,\ x_i \in X \text{ and } k_i \in \mathbb{Z} \text{ for each } i\}.$

(a) Show that $\langle X \rangle$ is a subgroup of $G$ that contains $X$.
(b) Show that $\langle X \rangle \subseteq H$ for every subgroup $H$ such that $X \subseteq H$.
Thus, $\langle X \rangle$ is the smallest subgroup of $G$ that contains $X$, and is called the subgroup
generated by $X$.

9. If $G$ is a group and $g \in G$, define $C(g) = \{z \in G \mid zg = gz\}$. Show that $C(g)$ is a
subgroup of $G$ (the centralizer of $g$ in $G$).

10. Let $X \subseteq \{1, 2, \ldots, n\}$ be a nonempty set. Show that $\{\sigma \in S_n \mid \sigma k = k \text{ for all } k \in X\}$
is a subgroup of $S_n$.


11. Let = { E | abeRiad o}. Show that G is a subgroup of GLo(R). 


12. Show that G = { ‘; ‘| | b ∈ R} is a subgroup of $GL_2(\mathbb{R})$.

13. (a) If $G$ is a group, show that $\{(g, g) \mid g \in G\}$ is a subgroup of $G \times G$.
(b) Determine the groups G such that {(g, 97) | g ∈ G} is a subgroup of G × G.

14. If $X$ is an infinite set, let $G$ be the set of all permutations $\sigma$ in $S_X$ such that $\sigma x = x$
for all but a finite number of elements $x$ of $X$. Show that $G$ is a subgroup of $S_X$.

15. In each case determine all subgroups of $G$ and draw the lattice diagram.
(a) @=Cs (b) G = Cs (c) G= Ss (d) G=Z5

16. Let $H$ and $K$ be subgroups of a group $G$.
(a) Show that $H \cap K$ is a subgroup of $G$ (Theorem 4).
(b) Show that $H \cap K$ is the largest subgroup contained in both $H$ and $K$ in the
sense that it contains every subgroup contained in both $H$ and $K$.

17. If $H$ and $K$ are subgroups of a group $G$, show that $H \cup K$ is a subgroup if and only
if $H \subseteq K$ or $K \subseteq H$.

18. If $a$ and $b$ are real numbers, define $\tau_{a,b} : \mathbb{R} \to \mathbb{R}$ by $\tau_{a,b}(x) = ax + b$ for all $x \in \mathbb{R}$.
Show that $G = \{\tau_{a,b} \mid a, b \in \mathbb{R},\ a \neq 0\}$ is a subgroup of $S_{\mathbb{R}}$.

19. Let $H$ and $K$ be subgroups of a group $G$ and let $g \in G$.
(a) If $G$ is abelian, describe the conjugates of $H$ in $G$.
(b) Show that $(gHg^{-1}) \cap (gKg^{-1}) = g(H \cap K)g^{-1}$.

20. (a) If $H$ is a subgroup of $G$ and $H \subseteq Z(G)$, show that $H$ is self-conjugate in $G$.
(b) Let $S_3 = \{\varepsilon, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$, where $\sigma^3 = \varepsilon = \tau^2$ and $\sigma\tau\sigma = \tau$. Show that
$H = \{\varepsilon, \sigma, \sigma^2\}$ is self-conjugate in $G$.

21. irG= { i: ‘| Jace Ra #0,c #0} find Z(G).

22. Find $Z(GL_2(\mathbb{R}))$.

23. Can a group $G$ have an abelian subgroup not contained in $Z(G)$? Defend your
answer.

24. If $ab = ba$ in a group $G$, let $H = \{g \in G \mid agb = bga\}$. Show that $H$ is a subgroup of
$G$.

25. If $H$ and $K$ are subgroups of $G$, define $HK = \{hk \mid h \in H,\ k \in K\}$. Show that
$HK$ is a subgroup if and only if $KH \subseteq HK$.




2.4 CYCLIC GROUPS AND THE ORDER OF AN ELEMENT 


We have already introduced the cyclic groups $C_n$, $n \geq 1$, but discussed these groups
only informally. Recall that $C_n$ has the form $C_n = \{1, a, \ldots, a^{n-1}\}$, where $a^n = 1$,
so $C_n$ consists of powers of $a$. In this section, we classify groups consisting of all
powers of a particular element and determine all subgroups of such groups. This
endeavor is important because these groups are building blocks for all sufficiently
“small” abelian groups (including all finite ones).

We begin by showing that the set of all powers of an element of a group $G$ is an
important subgroup of $G$.


Theorem 1. Let $g$ be an element of a group $G$ and write

$\langle g \rangle = \{g^k \mid k \in \mathbb{Z}\}.$

Then $\langle g \rangle$ is a subgroup of $G$, and $\langle g \rangle \subseteq H$ for every subgroup $H$ of $G$ with $g \in H$.


Proof. Clearly, $1 = g^0 \in \langle g \rangle$. If $x, y \in \langle g \rangle$, write them as $x = g^k$, $y = g^m$. Then the
exponent laws give $xy = g^{k+m} \in \langle g \rangle$ and $x^{-1} = g^{-k} \in \langle g \rangle$, and the subgroup test
applies. Finally, $g = g^1 \in \langle g \rangle$, and if $g \in H$ where $H$ is a subgroup then $\langle g \rangle \subseteq H$
because $g^k \in H$ for all integers $k$. ■


Hence, if $g \in G$ then $\langle g \rangle$ is the smallest subgroup of $G$ containing the element $g$.

If $g$ is an element of a group $G$, the subgroup $\langle g \rangle = \{g^k \mid k \in \mathbb{Z}\}$ is called the
cyclic subgroup of $G$ generated by $g$. If $G = \langle g \rangle$ for some $g \in G$, we say that
$G$ is a cyclic group and that $g$ is a generator of $G$. Thus, the generic cyclic group
$C_n = \{1, a, \ldots, a^{n-1}\}$, $a^n = 1$, is cyclic in the present sense, so the terminology is
consistent.


Example 1. If $G$ is any group, $\{1\} = \langle 1 \rangle$ is a cyclic subgroup of $G$.


Example 2. The group $G = \{1, -1, i, -i\}$ is cyclic. In fact, $i^2 = -1$ and $i^3 = -i$
show that $G = \langle i \rangle$. Similarly, $G = \langle -i \rangle$, so both $i$ and $-i$ are generators. But $-1$
is not a generator, because all positive and negative powers of $-1$ are either 1 or
$-1$. Hence $\langle -1 \rangle = \{1, -1\}$ is not all of $G$.


If a group $X$ is written additively, recall that the unity element is denoted 0 and
the inverse of $x \in X$ is denoted $-x$. The exponent $x^n$ (in multiplicative notation)
becomes $nx$ here, so the cyclic subgroup generated by $x$ is

$\langle x \rangle = \{kx \mid k \in \mathbb{Z}\} = \mathbb{Z}x$

consisting of the multiples of $x$. The laws of exponents translate as follows:

$x^{n+m} = x^n x^m$ becomes $(n + m)x = nx + mx$,
$(x^n)^m = x^{nm}$ becomes $m(nx) = (mn)x$,
and if $x$ and $y$ commute
$(xy)^n = x^n y^n$ becomes $n(x + y) = nx + ny$.

Here are two important examples of cyclic additive groups.


Example 3. Show that $(\mathbb{Z}, +)$ is cyclic and that 1 and $-1$ are the only generators.


Solution. If $k \in \mathbb{Z}$ then $k = k \cdot 1 \in \langle 1 \rangle$, so $\mathbb{Z} = \langle 1 \rangle$. Similarly, $\mathbb{Z} = \langle -1 \rangle$ because
$k = (-k) \cdot (-1)$. Clearly $n\mathbb{Z} \neq \mathbb{Z}$ if $n \neq 1$ and $n \neq -1$ (for example $n + 1 \notin n\mathbb{Z}$). □


Example 4. Show that $(\mathbb{Z}_n, +)$ is cyclic with generator $\bar{1}$.


Solution. We have $\mathbb{Z}_n = \{\bar{0}, \bar{1}, \bar{2}, \ldots, \overline{n-1}\}$ where, for the moment, we revert to
the formal $\bar{k}$ notation for residue classes. Given $\bar{k}$ in $\mathbb{Z}_n$, note that $\bar{k} = k\bar{1}$ is a
multiple of $\bar{1}$, and so $\bar{k} \in \langle \bar{1} \rangle$. It follows that $\mathbb{Z}_n = \langle \bar{1} \rangle$, as required. □


Example 5. In the multiplicative group $\mathbb{R}^*$ of nonzero real numbers, the cyclic
group $\langle 3 \rangle = \{\ldots, \tfrac{1}{27}, \tfrac{1}{9}, \tfrac{1}{3}, 1, 3, 9, 27, 81, \ldots\}$ consists of all the powers (positive,
zero, and negative) of 3. Note that these powers are all distinct in this case.




Example 6. Consider the group $\mathbb{Z}_7^* = \{1, 2, 3, 4, 5, 6\}$. Here are the powers of 2:

    2^{-5}  2^{-4}  2^{-3}  2^{-2}  2^{-1}  2^0  2^1  2^2  2^3  2^4  2^5  2^6
      2       4       1       2       4      1    2    4    1    2    4    1

If the elements in the bottom row are read left to right they “cycle” endlessly
through the sequence 1, 2, 4 (this is the source of the term cyclic group). Clearly
$\langle 2 \rangle = \{1, 2, 4\}$, and the reason that $\langle 2 \rangle$ has three elements is that 3 is the smallest
positive integer $n$ such that $2^n = 1$ in $\mathbb{Z}_7^*$.
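The table of powers in Example 6 can be regenerated with modular exponentiation. This short sketch assumes Python 3.8 or later, where the built-in `pow` accepts a negative exponent together with a modulus and then uses the inverse modulo 7; the dictionary name `powers_of_two` is ours.

```python
# Powers 2^k in the units modulo 7 for k = -5, ..., 6.
# Assumes Python 3.8+, where pow(2, -1, 7) returns the inverse of 2 modulo 7.
powers_of_two = {k: pow(2, k, 7) for k in range(-5, 7)}

for k, value in powers_of_two.items():
    print(k, value)

# The cyclic subgroup generated by 2 is the set of values that occur.
print(sorted(set(powers_of_two.values())))
```

The printed values repeat with period 3, and the distinct values are exactly the elements of $\langle 2 \rangle = \{1, 2, 4\}$.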


Order of an Element 


These examples point to one of the most useful concepts in group theory. Let $g$
be an element of a group, and suppose that $g^k = 1$ for some integer $k \neq 0$. Since
$g^{-k} = 1$ also holds, we may assume that $k \geq 1$, so the well-ordering principle
guarantees that there is a smallest integer $n \geq 1$ such that $g^n = 1$. This integer $n$
is called the order of $g$, and is denoted $o(g) = n$. If no such integer $n$ exists we say
that $g$ has infinite order and write $o(g) = \infty$. To sum up:


1. If $g^k = 1$ for some $k \neq 0$ then $o(g) = n$ is the smallest integer such
that $n \geq 1$ and $g^n = 1$.


2. If $g^k = 1$ only if $k = 0$ then $o(g) = \infty$.


In particular, in Example 5, $o(3) = \infty$ in $\mathbb{R}^*$, while in Example 6, $o(2) = 3$ in $\mathbb{Z}_7^*$.
Note that the unity element 1 is the only element of order 1 in any group.


Example 7. Find the order of each element in $\mathbb{Z}_8^* = \{1, 3, 5, 7\}$. Is $\mathbb{Z}_8^*$ cyclic?


Solution. We have $o(1) = 1$. Since $3^2 = 9 = 1$ in $\mathbb{Z}_8^*$, it follows that $o(3) = 2$. Similarly,
$o(5) = 2$ and $o(7) = 2$. Hence no element of $\mathbb{Z}_8^*$ has order 4, so $\mathbb{Z}_8^*$ is not cyclic. □
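The definition of order translates directly into a loop that multiplies by $g$ until the unity appears. The sketch below is a naive illustration that assumes the element has finite order (otherwise the loop would not terminate); the function name `order` and its parameters are our own.

```python
def order(g, op, identity):
    """Smallest n >= 1 with g^n == identity; assumes g has finite order."""
    n, power = 1, g
    while power != identity:
        power = op(power, g)
        n += 1
    return n

def mul_mod_8(x, y):
    """Multiplication in the units modulo 8."""
    return (x * y) % 8

orders = {g: order(g, mul_mod_8, 1) for g in [1, 3, 5, 7]}
print(orders)

# Example 6 revisited: the order of 2 in the units modulo 7.
print(order(2, lambda x, y: (x * y) % 7, 1))
```

The orders found in $\mathbb{Z}_8^*$ are 1, 2, 2, 2, so no element has order 4, in agreement with Example 7.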
Example 8. Find o k oa in GL2(Z). 

| . Then A? = 


Solution. Write A= F 


. i” and A=[F "| anh 


1 ==] Oo -1 


so A® =I. Since A* # I and A® # I, we conclude that o F e 6: Oo 


Example 9. If $\gamma = (k_1\ k_2\ \cdots\ k_r)$ is a cycle in $S_n$, then $o(\gamma) = r$ is the length of $\gamma$.


Solution. If the integers $k_1, k_2, \ldots, k_r$ are uniformly placed on a circle, the cycle $\gamma$ moves
each integer one position clockwise, as shown in the diagram. Hence $\gamma^2, \gamma^3, \ldots$ carry each
integer $2, 3, \ldots$ positions clockwise, respectively, so $\gamma^n \neq \varepsilon$ for $1 \leq n \leq r - 1$, whereas
$\gamma^r = \varepsilon$. This means that $o(\gamma) = r$. □

[Diagram: the integers $k_1, k_2, \ldots, k_r$ placed clockwise around a circle.]


Example 10. Show that $o(g^{-1}) = o(g)$ for any group element $g$.


Solution. If $k \in \mathbb{Z}$ then $(g^{-1})^k = (g^k)^{-1}$, and it follows that $(g^{-1})^k = 1$ if and only if
$g^k = 1$. Hence the smallest positive integer $n$ (if any) such that $g^n = 1$ is the same
as the smallest positive integer $n$ such that $(g^{-1})^n = 1$. That is, $o(g^{-1}) = o(g)$. □


Example 11. If $G$ is a finite group, show that every element $g \in G$ has finite order.


Solution. Since $G$ is finite, the powers $g, g^2, g^3, \ldots$ are not all distinct, so let $g^k = g^m$
with $k < m$. Then $g^{m-k} = 1$ where $m - k > 0$ so $o(g)$ is finite. □


Computing the order of an element is simplified by the next theorem.


Theorem 2. Let $G$ be a group and let $g \in G$ satisfy $o(g) = n$. Then
(1) $g^k = 1$ if and only if $n \mid k$.
(2) $g^k = g^m$ if and only if $k \equiv m \pmod{n}$.
(3) $\langle g \rangle = \{1, g, g^2, \ldots, g^{n-1}\}$ where $1, g, g^2, \ldots, g^{n-1}$ are all distinct.


Proof. We use the laws of exponents.

(1) If $n \mid k$, say $k = qn$, then $g^k = (g^n)^q = 1^q = 1$. Conversely, if $g^k = 1$,
write $k = qn + r$ with $0 \leq r < n$ (division algorithm). But then we have
$g^r = g^k(g^n)^{-q} = 1(1)^{-q} = 1$. Since $r < n$, this contradicts the minimality of $n$ unless
$r = 0$. So $r = 0$ and $n \mid k$.

(2) We have $g^k = g^m$ if and only if $g^{k-m} = 1$. Now apply (1).

(3) Clearly, $\{1, g, g^2, \ldots, g^{n-1}\} \subseteq \langle g \rangle$. To prove the other inclusion, let $x \in \langle g \rangle$,
say $x = g^k$. As before, write $k = qn + r$, where $0 \leq r \leq n - 1$. Then

$x = g^k = (g^n)^q g^r = 1^q g^r = g^r \in \{1, g, g^2, \ldots, g^{n-1}\},$

which shows that $\langle g \rangle \subseteq \{1, g, g^2, \ldots, g^{n-1}\}$. Hence $\langle g \rangle = \{1, g, g^2, \ldots, g^{n-1}\}$. To
complete the proof, suppose two of $1, g, g^2, \ldots, g^{n-1}$ are equal, say $g^k = g^m$, where
$0 \leq k \leq m < n$. Then $g^{m-k} = 1$ and $0 \leq m - k < n$. This implies that $m - k = 0$
by the minimality of $n$, so $m = k$. Thus $1, g, g^2, \ldots, g^{n-1}$ are distinct. ■


Theorem 2 asserts that if o(g) =n, then g* = 1 if and only if n|k. The following 
example illustrates how useful this is. 


Example 12. Find the order of 2 in $\mathbb{Z}_{19}^*$.


Solution. We compute in $\mathbb{Z}_{19}^*$: $2^3 = 8$, so $2^6 = 64 = 7$ and $2^9 = 56 = -1$. Hence
$2^{18} = 1$, so $o(2)$ divides 18 by Theorem 2. Thus, $o(2)$ is 1, 2, 3, 6, 9, or 18. We have
already eliminated 3, 6, and 9, so as $2^1 = 2$ and $2^2 = 4$, the only possibility remaining
is $o(2) = 18$. Note that, since $|\mathbb{Z}_{19}^*| = 18$, this shows that $\mathbb{Z}_{19}^*$ is cyclic and that 2 is
a generator. □
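The strategy of Example 12 is to find one exponent $k$ with $g^k = 1$ and then test only the divisors of $k$. A sketch of that idea for the units modulo a prime $p$ follows; taking $k = p - 1$ is our choice for the illustration (the code checks that $g^{p-1} = 1$ rather than assuming it), and the function name is hypothetical.

```python
def order_in_units_mod_prime(g, p):
    """Order of g in the units modulo p, testing only divisors of p - 1.

    The restriction to divisors is justified by Theorem 2 once g^(p-1) = 1
    is known, so that fact is verified first instead of being assumed.
    """
    exponent = p - 1
    if pow(g, exponent, p) != 1:
        raise ValueError("g^(p-1) is not 1, so the divisor search does not apply")
    for d in range(1, exponent + 1):
        if exponent % d == 0 and pow(g, d, p) == 1:
            return d  # divisors are tried in increasing order, so d is the order

print(order_in_units_mod_prime(2, 19))
print(order_in_units_mod_prime(2, 7))
```

For $g = 2$ and $p = 19$ the candidates are 1, 2, 3, 6, 9, 18, exactly the list examined in the solution, and only the last one works.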


The next result is the “companion” of Theorem 2 for elements $g$ with $o(g) = \infty$.


Theorem 3. Let $G$ be a group and let $g \in G$ satisfy $o(g) = \infty$. Then

(1) $g^k = 1$ if and only if $k = 0$.

(2) $g^k = g^m$ if and only if $k = m$.

(3) $\langle g \rangle = \{\ldots, g^{-2}, g^{-1}, 1, g, g^2, \ldots\}$, where the $g^k$ are distinct.

Proof. (1) Clearly $g^0 = 1$. If $g^k = 1$, $k \neq 0$, then $g^{-k} = (g^k)^{-1} = 1^{-1} = 1$ too.
Hence $g^n = 1$ for some $n > 0$, which implies that $o(g)$ is finite, contrary to hypothesis.
Thus $g^k = 1$ implies that $k = 0$.

(2) We have $g^k = g^m$ if and only if $g^{k-m} = 1$. Apply (1).

(3) $\langle g \rangle = \{g^k \mid k \in \mathbb{Z}\}$ by definition, and these powers are distinct by (2). ■


If $o(g) = n$, then $|\langle g \rangle| = n$ too, by (3) of Theorem 2, so $o(g) = |\langle g \rangle|$ in this
case. Since this also holds if $o(g) = \infty$, we have shown that our two uses of the
word “order” are compatible.


Corollary. We have $o(g) = |\langle g \rangle|$ for every element $g$ of any group.


We now use Theorem 2 to derive an elegant formula for the order of any permutation
$\sigma$ in $S_n$. Recall that $\sigma$ factors (uniquely) as a product of disjoint cycles
$\gamma_i$ (Theorem 5 §1.4). The order of $\sigma$ turns out to be the least common multiple of
the orders of the cycles $\gamma_i$ (which are the lengths of the $\gamma_i$ by Example 9).


Theorem 4. Let $\sigma$ be a permutation with factorization $\sigma = \gamma_1\gamma_2 \cdots \gamma_r$ into
disjoint cycles. Then $o(\sigma) = \operatorname{lcm}(o(\gamma_1), o(\gamma_2), \ldots, o(\gamma_r))$.


Proof. Write $n = o(\sigma)$, $n_i = o(\gamma_i)$, and $m = \operatorname{lcm}(n_1, n_2, \ldots, n_r)$. As $n_i \mid m$ for each $i$,
we have $\gamma_i^m = \varepsilon$, and so $\sigma^m = \gamma_1^m\gamma_2^m \cdots \gamma_r^m = \varepsilon$ (because the $\gamma_i$ commute). Hence
$n \mid m$ by Theorem 2. To show that $m \mid n$, it suffices to show that $\gamma_i^n = \varepsilon$ for each $i$
(then $n_i \mid n$ by Theorem 2 so $m \mid n$ by the definition of the least common multiple).
We show that $\gamma_1^n = \varepsilon$; the others are similar. This requires proving that $\gamma_1^n k = k$ for
every integer $k$ in the set being permuted. This is clear if $k$ is fixed by $\gamma_1$, so let $k$ be
moved by $\gamma_1$. Then $k$ is fixed by each of $\gamma_2, \ldots, \gamma_r$, because the $\gamma_i$ are disjoint. Thus,
since $\varepsilon = \sigma^n = \gamma_1^n\gamma_2^n \cdots \gamma_r^n$, we have

$k = \varepsilon k = (\sigma^n)k = (\gamma_1^n\gamma_2^n \cdots \gamma_r^n)k = \gamma_1^n k.$

It follows that $\gamma_1^n = \varepsilon$, as required. ■


Example 13. Find the order of

    σ = ( 1  2  3   4   5   6   7  8  9  10  11  12  13  14 )
        ( 5  7  9  14  10  11  12  8  3  13   2   6   4   1 )

Solution. Here $\sigma = (1\ 5\ 10\ 13\ 4\ 14)(2\ 7\ 12\ 6\ 11)(3\ 9)$ is the cycle factorization,
so Theorem 4 gives $o(\sigma) = \operatorname{lcm}(6, 5, 2) = 30$. □
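Theorem 4 suggests a simple procedure: trace out the disjoint cycles, record their lengths, and take the least common multiple. The following sketch applies it to the permutation of Example 13; the dictionary representation and the helper names `cycle_lengths` and `permutation_order` are our own choices.

```python
from functools import reduce
from math import gcd

def cycle_lengths(images):
    """Lengths of the disjoint cycles of a permutation given as {point: image}."""
    seen, lengths = set(), []
    for start in images:
        if start in seen:
            continue
        length, point = 0, start
        while point not in seen:   # follow the cycle until it closes up
            seen.add(point)
            point = images[point]
            length += 1
        lengths.append(length)
    return lengths

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def permutation_order(images):
    """Order of the permutation: the lcm of its cycle lengths (Theorem 4)."""
    return reduce(lcm, cycle_lengths(images), 1)

# Bottom row of the two-row notation in Example 13.
bottom_row = [5, 7, 9, 14, 10, 11, 12, 8, 3, 13, 2, 6, 4, 1]
sigma = {k: image for k, image in enumerate(bottom_row, start=1)}

print(sorted(cycle_lengths(sigma)))
print(permutation_order(sigma))
```

The cycle lengths found are 6, 5, 2, together with a cycle of length 1 for the fixed point 8, and their least common multiple is 30.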
The next result will be used several times below. 
Theorem 5. Let $o(g) = n$ for g in some group. If $d \mid n$, $d \geq 1$, then $o(g^d) = n/d$.
Proof. Write $n/d = k$ for convenience. Then $(g^d)^k = g^n = 1$, so we must show that
k is the smallest such positive integer. Suppose $(g^d)^r = 1$, $r \geq 1$. Then $g^{dr} = 1$ so
$n \mid dr$ by Theorem 2, say $dr = qn$, $q \geq 1$. But then $dr = q(dk)$, so $r = qk$ because
these are integers and $d \neq 0$. It follows that $r \geq k$, as required. ■


Other Properties of Cyclic Groups 


Theorem 6. Every cyclic group is abelian, but the converse does not hold. 


Proof. Let $G = \langle g\rangle$ be cyclic with generator g. If $x, y \in G$, write $x = g^k$, $y = g^m$,
where $k, m \in \mathbb{Z}$. Then the exponent laws give

$xy = g^k g^m = g^{k+m} = g^{m+k} = g^m g^k = yx,$

so G is abelian. However $\mathbb{Z}_8^*$ is abelian but not cyclic by Example 7. ■




As the proof of Theorem 6 illustrates, computations in a cyclic group depend 
entirely on the exponents of the generator. As these exponents are integers, the 
facts about Z derived in Chapter 1 turn out to be useful. In particular, the division 
algorithm plays a natural role in the proof of Theorem 7. 


Theorem 7. Every subgroup of a cyclic group is cyclic. 


Proof. Suppose that $G = \langle g\rangle = \{g^k \mid k \in \mathbb{Z}\}$ is cyclic and let H be a subgroup of
G. If $H = \{1\}$, then $H = \langle 1\rangle$ is cyclic. Otherwise, let $g^k \in H$, $k \neq 0$. Because H
is a subgroup, $g^{-k} = (g^k)^{-1} \in H$, and so we may assume that $k > 0$. Hence let m
be the smallest positive integer such that $g^m \in H$. Then $\langle g^m\rangle \subseteq H$, and we claim
this is equality. To see this, let $g^k \in H$ and write $k = qm + r$, $0 \leq r < m$, by the
division algorithm. It suffices to show that $r = 0$ (then $g^k = (g^m)^q \in \langle g^m\rangle$). But
$g^r = (g^m)^{-q}g^k \in H$, which contradicts the minimality of m unless $r = 0$. ■


A cyclic group $G = \langle g\rangle$ can have other generators, for example, $G = \langle g^{-1}\rangle$.
Theorem 8 explicitly describes all generators of a finite cyclic group.


Theorem 8. Let $G = \langle g\rangle$ be a cyclic group, where $o(g) = n$. Then $G = \langle g^k\rangle$ if and
only if $\gcd(k, n) = 1$.


Proof. If $G = \langle g^k\rangle$, then $g \in \langle g^k\rangle$, say $g = (g^k)^m$ where $m \in \mathbb{Z}$. Thus $g^1 = g^{km}$, so
n divides $1 - km$ by Theorem 2. Then $1 - km = qn$ for $q \in \mathbb{Z}$; that is, $1 = km + qn$,
which implies that $\gcd(k, n) = 1$. Conversely, if $\gcd(k, n) = 1$ then $1 = xk + yn$ for
some integers x and y by Theorem 4 §1.2. Hence

$g = g^1 = (g^k)^x \cdot (g^n)^y = (g^k)^x \cdot (1)^y = (g^k)^x \in \langle g^k\rangle,$

which implies that $G = \langle g^k\rangle$. ■


Hence, for example, if $o(g) = 12$ the generators of $G = \langle g\rangle$ are the powers $g^k$
where $\gcd(k, 12) = 1$, that is $g, g^5, g^7$, and $g^{11}$. In particular, the generators of the
additive cyclic group $\mathbb{Z}_{12}$ are the residues 1, 5, 7, and 11.
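
Theorem 8 is easy to test by brute force in the additive group $\mathbb{Z}_n$. The sketch below is illustrative only (the helper names `generators` and `generated` are not from the text): it lists the residues coprime to n and cross-checks them against the subgroup each residue actually generates.

```python
from math import gcd


def generators(n):
    """Residues k in Z_n with gcd(k, n) == 1; by Theorem 8 these generate Z_n."""
    return [k for k in range(n) if gcd(k, n) == 1]


def generated(k, n):
    """The cyclic subgroup of the additive group Z_n generated by k,
    built by repeated addition until the sum returns to 0."""
    elems, x = set(), 0
    while True:
        elems.add(x)
        x = (x + k) % n
        if x == 0:
            return elems


n = 12
print(generators(n))  # [1, 5, 7, 11]

# Brute-force cross-check: k generates Z_n exactly when gcd(k, n) == 1.
assert all((len(generated(k, n)) == n) == (gcd(k, n) == 1) for k in range(n))
```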

Theorem 9 below gives a complete description of all subgroups of a finite cyclic 
group G. In particular, it shows that G has a unique subgroup of order k for every 
divisor k of n, and that these are the only subgroups of G. 


Theorem 9. Fundamental Theorem of Finite Cyclic Groups. Let $G = \langle g\rangle$
be a cyclic group of order n.
(1) If H is a subgroup of G, then $H = \langle g^d\rangle$ for some $d \mid n$. Hence $|H|$ divides n.
(2) Conversely, if $k \mid n$, then $\langle g^{n/k}\rangle$ is the unique subgroup of G of order k.


Proof. (1) Theorem 7 implies that $H = \langle g^m\rangle$ for some m. Let $d = \gcd(m, n)$; we
show that $H = \langle g^d\rangle$. We have $d \mid m$, say $m = qd$, so $g^m = (g^d)^q \in \langle g^d\rangle$, whence
$H \subseteq \langle g^d\rangle$. On the other hand, $d = xm + yn$ for some $x, y \in \mathbb{Z}$, so

$g^d = (g^m)^x \cdot (g^n)^y = (g^m)^x (1)^y = (g^m)^x \in \langle g^m\rangle = H.$

Hence $\langle g^d\rangle \subseteq H$, so $\langle g^d\rangle = H$. But then $|H| = n/d$ by Theorem 5, so $|H|$ divides n.




(2) Suppose that K is any subgroup of G of order k where $k \mid n$. By (1) let
$K = \langle g^d\rangle$ where $d \mid n$. Then Theorems 2 and 5 give $k = |K| = o(g^d) = n/d$. It follows
that $d = n/k$, so $K = \langle g^{n/k}\rangle$. This proves (2). ■


If G is finite and cyclic and H is a subgroup, part (1) of Theorem 9 shows that $|H|$
divides $|G|$. In fact, this result is true for any finite group G, cyclic or not. The
general result is called Lagrange’s theorem, which we prove in Section 2.6.


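Example 14. Find all subgroups of $C_{12}$ and draw the lattice diagram.


Solution. Let $C_{12} = \langle g\rangle$, $o(g) = 12$. The divisors of 12 are 1, 2, 3, 4, 6, and 12. Using
Theorem 9, the unique subgroup of each of these orders is, respectively,

$\{1\} = \langle g^{12}\rangle,\ \langle g^6\rangle,\ \langle g^4\rangle,\ \langle g^3\rangle,\ \langle g^2\rangle,$ and $\langle g\rangle = G$.

[Lattice diagram: $C_{12}$ at the top; $\langle g^2\rangle$ and $\langle g^3\rangle$ below it; then $\langle g^4\rangle$ and $\langle g^6\rangle$; $\{1\}$ at the bottom.]

Note that $\langle g^m\rangle \subseteq \langle g^k\rangle$ if and only if $k \mid m$. □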
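
The subgroup list in Example 14 can be reproduced in the additive model $\mathbb{Z}_{12}$, where $g^d$ corresponds to the residue d. This is a small sketch under that assumption; the names `divisors` and `subgroup` are illustrative.

```python
def divisors(n):
    """Positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def subgroup(d, n):
    """For a divisor d of n, the subgroup <d> of additive Z_n:
    the multiples of d, a subgroup of order n // d."""
    return frozenset(range(0, n, d))


n = 12
subs = {d: subgroup(d, n) for d in divisors(n)}
for d, H in subs.items():
    print(f"<{d}> has order {len(H)}: {sorted(H)}")

# Containment in the lattice: <m> is inside <k> exactly when k divides m.
for k in subs:
    for m in subs:
        assert (subs[m] <= subs[k]) == (m % k == 0)
```

Each divisor d of 12 gives one subgroup, of order 12/d, which matches the count of six subgroups in the example.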


We think of the cyclic group $G = \langle g\rangle$ as being generated by the single element
g. We conclude this section with a brief discussion of subgroups generated by more
than one element.


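Theorem 10. Let X be a nonempty subset of a group G and let

$\langle X\rangle = \{x_1^{k_1}x_2^{k_2}\cdots x_m^{k_m} \mid x_i \in X,\ k_i \in \mathbb{Z},\ m \geq 1\}$

denote the set of all products of powers of (not necessarily distinct) elements of X.
Then

(1) $\langle X\rangle$ is a subgroup of G containing X.

(2) If H is a subgroup of G with $X \subseteq H$, then $\langle X\rangle \subseteq H$.


Proof. (1) Choose $x \in X$ (since $X \neq \emptyset$). Then $1 = x^0 \in \langle X\rangle$. The set $\langle X\rangle$ is
clearly closed and, if $g = x_1^{k_1}x_2^{k_2}\cdots x_m^{k_m}$ is in $\langle X\rangle$, then $g^{-1} = x_m^{-k_m}\cdots x_2^{-k_2}x_1^{-k_1}$ is
also in $\langle X\rangle$. Hence $\langle X\rangle$ is a subgroup of G by the subgroup test.

(2) If $X \subseteq H$ and $g = x_1^{k_1}x_2^{k_2}\cdots x_m^{k_m}$ is in $\langle X\rangle$, then each $x_i^{k_i}$ is in H because
$x_i \in X \subseteq H$ and H is a subgroup. Hence $g \in H$, proving (2). ■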


Thus, if X is a nonempty subset of a group G, the subgroup $\langle X\rangle$ in Theorem 10 is
the smallest subgroup of G that contains X (in the sense of (2) of Theorem 10).
Hence $\langle X\rangle$ is called the subgroup generated by X. If G has the form $G = \langle X\rangle$ for
some $X \subseteq G$, we call X a set of generators for G; if X is finite, we say that G is
a finitely generated group.

Obviously, $\langle g\rangle = \langle\{g\}\rangle$, so the cyclic groups are exactly the subgroups generated
by singleton subsets. Similarly, it is customary to write

$\langle\{g_1, g_2, \dots, g_n\}\rangle = \langle g_1, g_2, \dots, g_n\rangle$

for finitely generated groups.


Example 15. Consider the symmetric group $S_3 = \{\varepsilon, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$, where
$o(\sigma) = 3$, $o(\tau) = 2$, and $\sigma\tau = \tau\sigma^2$. Then $S_3 = \langle\sigma, \tau\rangle$.




Example 16. The Klein group K4 = {1,a,b, ab} is generated by any two nonunity 
elements. 
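
Examples 15 and 16 can be verified by computing the generated subgroup directly. The sketch below is an illustration, not the text's method: permutations are stored as tuples on {0, ..., n-1}, and the hypothetical helper `generated_subgroup` closes the identity under right multiplication by the generators. In a finite group this is enough, because every inverse is a positive power.

```python
def compose(p, q):
    """Composite permutation p∘q, so that (p∘q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def generated_subgroup(gens):
    """All products of the given permutations (tuples on {0, ..., n-1}).
    For a finite group this set is the subgroup generated by gens."""
    identity = tuple(range(len(gens[0])))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


# S_3 is generated by a 3-cycle and a transposition (Example 15).
sigma, tau = (1, 2, 0), (1, 0, 2)
print(len(generated_subgroup([sigma, tau])))  # 6

# A copy of the Klein group inside S_4 (Example 16): any two
# nonidentity elements generate all four elements.
a, b = (1, 0, 3, 2), (2, 3, 0, 1)
ab = compose(a, b)
for pair in ([a, b], [a, ab], [b, ab]):
    assert len(generated_subgroup(pair)) == 4
```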


Exercises 2.4 


1. Find all generators of the cyclic group $G = \langle g\rangle$ if
(a) o(g) = 5 (b) o(g) = 10 (c) o(g) = 16 (d) o(g) = 20

2. Find all generators of
(a) $\mathbb{Z}_5$ (b) $\mathbb{Z}_{10}$ (c) $\mathbb{Z}_{16}$ (d) $\mathbb{Z}_{20}$

3. Find all generators of
(a) $G = \langle g\rangle$, where $o(g) = \infty$ (b) $\mathbb{Z}$

4. In each case determine whether G is cyclic.
(a) G=Z; (b) G=Zi, (c) G= Zig (d)G=Z,

5. (a) Is $\mathbb{Q}^*$ cyclic? Justify your answer.
(b) Is $\mathbb{Q}$ cyclic? Justify your answer.

6. If G is a group and $g \in G$, show that $\langle g\rangle = \langle g^{-1}\rangle$.

7. Let o(g) = 20 in a group G. Compute
(a) o(g*) — (b) og") (©) ofg*) (A) of")

8. (a) Find an element of maximum order in $S_5$.
(b) Find an element of maximum order in $S_7$.

9. In each case find all subgroups of $G = \langle g\rangle$ and draw the lattice diagram.
(a) o(g) = 8 (b) o(g) = 10 (c) o(g) = 18
(d) o(g)= p®, p is a prime.
(e) o(g) = pq, p and q are distinct primes.
(f) o(g)= pq, p and q are distinct primes.

10. (a) If gh = hg in a group and o(g) and o(h) are finite, show that o(gh) is finite.
(b) Show that (a) fails if gh ≠ hg by considering B 2 and & | ;

11. Let G be a cyclic group of order n.
(a) Show that $g^n = 1$ for all $g \in G$.
(b) If $g^m = 1$ in G where gcd(m, n) = 1, show that g = 1.

12. Let $g = e^{2\pi i/n}$ in $U_n$. Show that o(g) = n.

13. (a) If $G = \{g_1, g_2, \dots, g_n\}$ is an abelian group, let $a = g_1g_2\cdots g_n$. Show that $a^2 = 1$.
(b) Prove Wilson's Theorem: $(p - 1)! \equiv -1 \pmod{p}$ if p is a prime. [Hint: $\mathbb{Z}_p^*$.]

14. Suppose that G is a group in which {1} and G are the only subgroups. Show that G
is finite and, in fact, is cyclic of order 1 or a prime.

15. Show that $\langle a, b\rangle = \langle a, ab\rangle = \langle a^{-1}, b^{-1}\rangle$ for all a and b in a group G.

16. In each case, find the subgroup $H = \langle x, y\rangle$ of G.
(a) G = (a) is cyclic, x = a4, y =a?
(b) G = (a) is cyclic, z = a8, y = a8
(c) $G = \langle a\rangle$ is cyclic, $x = a^m$, $y = a^k$, gcd(m, k) = d
(d) $G = S_3$, x = (1 2), y = (2 3)
(e) G = (a) x (6), o(a)=4 = 0(b), x = (a,b), y = (a,b)
(f) G= (a) x (0) , o(a)= 4, o(b)= 6,0 = (a,b), y= (a, b°)

17. (a) If $X \subseteq Y$ in a group, show that $\langle X\rangle \subseteq \langle Y\rangle$.
(b) Show that a nonempty subset X is a subgroup if and only if $\langle X\rangle = X$.

18. If $G = \langle g\rangle$ and $H = \langle h\rangle$, show that $G \times H = \langle (g, 1), (1, h)\rangle$.

19. If $G = \langle X\rangle$ and xy = yx for all $x, y \in X$, show that G is abelian.


20. (a) Find three elements of $C_6 \times C_{15}$ of maximum order.
(b) Find one element of maximum order in $C_m \times C_n$.

21. Find the smallest positive integer n such that $\sigma^n = \varepsilon$ for every $\sigma \in S_5$.

22. If $\sigma \in S_n$ and $o(\sigma) = p$ is a prime, show that $\sigma$ is a product of disjoint p-cycles.

23. (a) Show that $o(h) = o(ghg^{-1})$ for all $g, h \in G$. [Hint: Example 10.]
(b) Show that o(gh) = o(hg) for all $g, h \in G$. [Hint: Example 10.]

24. (a) If h is the only element of order 2 in a group G, show that $h \in Z(G)$. [Hint: Exercise 23(a).]
(b) If a is the unique element of order 3 in G, what can you say about a?

25. Let G and H be cyclic groups, with |G| = m and |H| = n. If gcd(m, n) = 1, show
that $G \times H$ is cyclic. [Hint: If $G = \langle g\rangle$ and $H = \langle h\rangle$, use Theorem 5 §1.2 to show
o((g, h)) = mn.]

26. Let o(g) = m and o(h) = n in a group G, where m and n are relatively prime.
(a) If gh = hg, show that o(gh) = mn. Is o(gh) = lcm(m, n) in general? [Hint: Theorem 5 §1.2.]
(b) If o(a) = mn, show that a = gh = hg for some $g, h \in G$ with o(g) = m and
o(h) = n. [Hint: Theorem 4 §1.2.]

27. Let $G = \langle g\rangle$ be a cyclic group and let $A = \langle g^a\rangle$ and $B = \langle g^b\rangle$ be cyclic subgroups.
(a) If $o(g) = \infty$, show that $A \subseteq B$ if and only if a = qb for some $q \in \mathbb{Z}$.
(b) If o(g) = n, show that $A \subseteq B$ if and only if $a \equiv qb \pmod{n}$ for some $q \in \mathbb{Z}$.

28. Let H be a subgroup of a group G and let $a \in G$, o(a) = n. If m is the smallest
positive integer such that $a^m \in H$, show that $m \mid n$.

29. If o(g) = n, show that $o(g^k) = n/d$, where d = gcd(n, k). [Hint: Proof of Theorem 9.]

30. Let $G = \langle g\rangle$ where o(g) = n. Given $g^k \in G$, show $\langle g^k\rangle = \langle g^d\rangle$, where d = gcd(k, n).
[Hint: Theorem 3 §1.2.]

31. Let $G = \langle g\rangle$ be a cyclic group and let $A = \langle g^a\rangle$ and $B = \langle g^b\rangle$.
(a) If $o(g) = \infty$, show that $A \cap B = \langle g^m\rangle$, where m = lcm(a, b).
(b) If o(g) = n, assume (Theorem 9) that $a \mid n$ and $b \mid n$. Show that $A \cap B = \langle g^m\rangle$,
where m = lcm(a, b).

32. Show that the following conditions are equivalent for a finite group G.
(1) G is cyclic and $|G| = p^n$, where p is a prime and $n \geq 0$.
(2) If H and K are subgroups of G, either $H \subseteq K$ or $K \subseteq H$.
[Hint: For (1) ⇒ (2) use Theorem 8.]

33. If a group G has a finite number of subgroups, show that G must be finite.

34. Prove the Chinese Remainder Theorem. Let $n_1, n_2, \dots, n_r$ be positive integers,
relatively prime in pairs. Given integers $m_1, m_2, \dots, m_r$, show that there exists $m \in \mathbb{Z}$
such that $m_i \equiv m \pmod{n_i}$ for each i. [Hint: Extend Exercise 25 to r groups.]

35. (a) Let o(a) = m and o(b) = n in a group G. If ab = ba, show that an element $c \in G$
exists, with o(c) = lcm(m, n). [Hint: Theorem 10 §1.2, Theorem 8, and Exercise 26(a).]
(b) Let G be an abelian group and assume that G has an element of maximal order
n (always true if G is finite). Show that $g^n = 1$ for all $g \in G$. [Hint: Part (a).]

36. Let m be the smallest positive integer such that $\sigma^m = \varepsilon$ for all $\sigma \in S_n$. Show that
m = lcm(2, 3, 4, 5, ..., n).

37. For a deck of 2n distinct cards, a “perfect shuffle” means cutting the deck into two
equal halves and collating them as follows: If the cards were originally in the order
1, 2, 3, 4, ..., 2n, they end up in the order 1, n+1, 2, n+2, ..., n, 2n. In each case,
determine the number of perfect shuffles required to bring the deck back into its
original order.
(a) n = 4, 5, 6, and 7 (b) n = 8, 9, and 10
(c) n = 12 (d) n = 26 (a regular deck)


2.5 HOMOMORPHISMS AND ISOMORPHISMS 


Mathematicians do not deal in objects, but in relations among objects; they are free to 
replace some objects by others so long as the relations remain unchanged. Content to 
them is irrelevant: they are interested in form only. 


—Henri Poincaré 


Up to this point we have ignored mappings from one group to another. The interesting ones are those that preserve the group multiplication in the following sense:
If G and H are groups, a mapping $\alpha: G \to H$ is called a homomorphism²⁷ if
$\alpha(ab) = \alpha(a) \cdot \alpha(b)$ for all a and b in G.
Of course the product ab here is in G while $\alpha(a) \cdot \alpha(b)$ is in H.
Homomorphisms arise in many forms as the following examples illustrate.


Example 1. The mapping $\alpha: \mathbb{Z} \to \mathbb{Z}$ given by $\alpha(a) = 3a$ is a homomorphism of
additive groups because $\alpha(a + b) = 3(a + b) = 3a + 3b = \alpha(a) + \alpha(b)$ for all $a, b \in \mathbb{Z}$.


Example 2. If a is an element of a group G, define the exponent map $\alpha: \mathbb{Z} \to \langle a\rangle$
by $\alpha(k) = a^k$ for all $k \in \mathbb{Z}$. Then $\alpha$ is a homomorphism because (as the operation
in $\mathbb{Z}$ is addition)

$\alpha(k + m) = a^{k+m} = a^k a^m = \alpha(k) \cdot \alpha(m)$ for all $k, m \in \mathbb{Z}$.


Example 3. Let $\mathbb{R}^+$ denote the group of positive real numbers under multiplication. The absolute value map $\alpha: \mathbb{C}^* \to \mathbb{R}^+$ given by $\alpha(z) = |z|$ for all $z \in \mathbb{C}^*$ is a
homomorphism (in fact, onto) because $|zw| = |z||w|$ for all $z, w \in \mathbb{C}$.


Example 4. Let $GL_n(\mathbb{R})$ denote the general linear group of n × n invertible
matrices over $\mathbb{R}$. The determinant map $GL_n(\mathbb{R}) \to \mathbb{R}^*$ given by $A \mapsto \det A$ is
a homomorphism because $\det(AB) = \det A \det B$ for all matrices A and B, and
$\det A \neq 0$ if A is invertible.


Example 5. The identity map $1_G: G \to G$ is a homomorphism for any group G
because $1_G(ab) = ab = 1_G(a) \cdot 1_G(b)$ for all a, b in G.


Example 6. For groups G and H, there is at least one homomorphism from G to
H, the trivial homomorphism $\alpha: G \to H$ defined by $\alpha(g) = 1$ for all $g \in G$.


Example 7. If $\alpha: G \to H$ and $\beta: H \to K$ are homomorphisms, show that the
composite map $\beta\alpha: G \to K$ is also a homomorphism.
Solution. This is because, for all a and b in G,

$\beta\alpha(ab) = \beta[\alpha(ab)] = \beta[\alpha(a) \cdot \alpha(b)] = \beta[\alpha(a)] \cdot \beta[\alpha(b)] = \beta\alpha(a) \cdot \beta\alpha(b).$ □


By definition, a homomorphism $\alpha: G \to H$ is a mapping that preserves the
operation in the sense that $\alpha(ab) = \alpha(a)\alpha(b)$ for all a and b in G. Theorem 1 shows
that $\alpha$ is “structure preserving” in the sense that it also preserves the unity, inverses,
and powers.


²⁷Homomorphisms were first used explicitly (for permutation groups) by Jordan in 1870.




Theorem 1. Let $\alpha: G \to H$ be a group homomorphism. Then

(1) $\alpha(1_G) = 1_H$. ($\alpha$ preserves the unity)
(2) $\alpha(g^{-1}) = \alpha(g)^{-1}$ for all $g \in G$. ($\alpha$ preserves inverses)
(3) $\alpha(g^k) = \alpha(g)^k$ for all $g \in G$ and $k \in \mathbb{Z}$. ($\alpha$ preserves powers)


Proof. (1) Here $\alpha(1_G) \cdot \alpha(1_G) = \alpha(1_G^2) = \alpha(1_G) = \alpha(1_G) \cdot 1_H$. Now cancel in H.
(2) From (1), $\alpha(g^{-1}) \cdot \alpha(g) = \alpha(g^{-1}g) = \alpha(1) = 1$, which gives (2).
(3) For k = 0, $\alpha(g^0) = \alpha(1) = 1 = [\alpha(g)]^0$. If (3) holds for some $k \geq 0$, then

$\alpha(g^{k+1}) = \alpha(gg^k) = \alpha(g) \cdot \alpha(g^k) = \alpha(g) \cdot [\alpha(g)]^k = [\alpha(g)]^{k+1}.$

Hence (3) holds for all $k \geq 0$ by induction. If k < 0, write k = −m, m > 0. Then
(2) and the preceding calculation give

$\alpha(g^k) = \alpha[(g^m)^{-1}] = [\alpha(g^m)]^{-1} = [\alpha(g)^m]^{-1} = [\alpha(g)]^{-m} = [\alpha(g)]^k.$

Thus $[\alpha(g)]^k = \alpha(g^k)$ for all $k \in \mathbb{Z}$. ■


Corollary 1. Let $\alpha: G \to H$ be a homomorphism. If $g \in G$ has finite order, then
$\alpha(g)$ also has finite order, and $o(\alpha(g))$ divides o(g).


Proof. If o(g) = n then $g^n = 1$, so $\alpha(g)^n = \alpha(g^n) = \alpha(1) = 1$. Hence $o(\alpha(g))$ divides
n by Theorem 2 §2.4. ■
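
Corollary 1 can be illustrated with a concrete homomorphism. The sketch below uses the reduction map from additive $\mathbb{Z}_{12}$ to additive $\mathbb{Z}_4$, an example chosen here for illustration and not taken from the text; the names `order` and `alpha` are likewise illustrative.

```python
def order(x, n):
    """Order of x in the additive group Z_n: the least k >= 1
    with k*x congruent to 0 modulo n."""
    k, y = 1, x % n
    while y != 0:
        y = (y + x) % n
        k += 1
    return k


def alpha(x):
    """Reduction Z_12 -> Z_4; well defined because 4 divides 12."""
    return x % 4


# alpha preserves the (additive) operation, so it is a homomorphism.
assert all(alpha((x + y) % 12) == (alpha(x) + alpha(y)) % 4
           for x in range(12) for y in range(12))

# The order of the image divides the order of the element.
for g in range(12):
    assert order(g, 12) % order(alpha(g), 4) == 0
    print(g, order(g, 12), order(alpha(g), 4))
```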


Corollary 2. If $\alpha: G \to H$ is a homomorphism, write $\alpha(G) = \{\alpha(g) \mid g \in G\}$.
Then $\alpha(G)$ is a subgroup of H.


Proof. This follows from the subgroup test because of the following observations:
$1_H = \alpha(1_G) \in \alpha(G)$; $\alpha(g)\alpha(g_1) = \alpha(gg_1) \in \alpha(G)$; and $\alpha(g)^{-1} = \alpha(g^{-1}) \in \alpha(G)$. ■


The group $\alpha(G)$ in Corollary 2 is called the image of $\alpha$. Note that $\alpha: G \to H$ is
onto if and only if $\alpha(G) = H$.

Example 8. Let $\alpha: G \to H$ be an onto homomorphism.

(1) If G is abelian show that H is abelian.

(2) If $G = \langle a\rangle$ is cyclic show that H is cyclic and $H = \langle\alpha(a)\rangle$.


Solution. Let $h, h_1 \in H$. Since $\alpha$ is onto, write $h = \alpha(g)$ and $h_1 = \alpha(g_1)$, $g, g_1 \in G$.
(1) If G is abelian: $hh_1 = \alpha(g)\alpha(g_1) = \alpha(gg_1) = \alpha(g_1g) = \alpha(g_1)\alpha(g) = h_1h$.

(2) Let $G = \langle a\rangle$. If $h \in H$, say $h = \alpha(g)$, let $g = a^k$, $k \in \mathbb{Z}$. It suffices to prove
that $h \in \langle\alpha(a)\rangle$. But $h = \alpha(a^k) = \alpha(a)^k \in \langle\alpha(a)\rangle$, as required. □


Let G and H denote groups. In order to show that two mappings $\alpha: G \to H$
and $\beta: G \to H$ are equal, we must verify that $\alpha(g) = \beta(g)$ holds for all $g \in G$.
However, if $\alpha$ and $\beta$ are homomorphisms, this need only be checked for all g in
some generating set for G (see Theorem 10 §2.4).


Theorem 2. Let $\alpha: G \to H$ and $\beta: G \to H$ be homomorphisms and assume that
$G = \langle X\rangle$ is generated by a subset X. Then

$\alpha = \beta$ if and only if $\alpha(x) = \beta(x)$ for all $x \in X$.




Proof. If $\alpha = \beta$, the condition is obvious. If the condition holds, let $g \in G$ and
write (Theorem 10 §2.4) $g = x_1^{k_1}x_2^{k_2}\cdots x_m^{k_m}$, where $x_i \in X$ and $k_i \in \mathbb{Z}$ for each i.
Then Theorem 1 gives

$\alpha(g) = \alpha(x_1)^{k_1}\alpha(x_2)^{k_2}\cdots\alpha(x_m)^{k_m} = \beta(x_1)^{k_1}\beta(x_2)^{k_2}\cdots\beta(x_m)^{k_m} = \beta(g).$

As $g \in G$ was arbitrary, this shows that $\alpha = \beta$. ■


Theorem 2 shows that a group homomorphism $\alpha: G \to H$ is completely determined by its effect on a generating set for G. This is useful because many groups
are generated by a relatively small number of elements.


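Example 9. Show that there are at most six homomorphisms $S_3 \to C_6$.


Solution. As in Example 8 §2.2 we write $S_3 = \{\varepsilon, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$ where
$o(\sigma) = 3$, $o(\tau) = 2$, and $\sigma\tau\sigma = \tau$, and write $C_6 = \langle c\rangle$, $o(c) = 6$. Hence $S_3 = \langle\sigma, \tau\rangle$,
so Theorem 2 shows that a homomorphism $\alpha: S_3 \to C_6$ is determined by the choice
of $\alpha(\sigma)$ and $\alpha(\tau)$ in $C_6$. Now $\alpha(\sigma)^3 = \alpha(\sigma^3) = \alpha(\varepsilon) = 1$, so $o(\alpha(\sigma))$ is 1 or 3.
Hence there are three choices for $\alpha(\sigma)$: $1$, $c^2$, or $c^4$. Similarly, $\alpha(\tau)^2 = 1$, so $\alpha(\tau)$
must be $1$ or $c^3$. Thus, there are at most 3 · 2 = 6 choices in all for $\alpha$. □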


We hasten to note that not all the choices in Example 9 correspond to actual
homomorphisms. In fact, there are only two homomorphisms from $S_3$ to $C_6$, and
we return to this example later (see Example 9 §2.10).
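
The count of two can be confirmed by brute force over the six candidates of Example 9. The sketch below makes several modelling assumptions that are not in the text: $S_3$ is represented by permutation tuples on {0, 1, 2}, $C_6$ is written additively as $\mathbb{Z}_6$ (so $c^k$ corresponds to the residue k), and each candidate is extended to all of $S_3$ through the unique form $\tau^i\sigma^j$.

```python
from itertools import product


def compose(p, q):
    """Composite permutation p∘q, so that (p∘q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


identity = (0, 1, 2)
sigma = (1, 2, 0)  # a 3-cycle
tau = (1, 0, 2)    # a transposition


def power(p, k):
    """k-fold composite of p with itself (the identity when k == 0)."""
    r = identity
    for _ in range(k):
        r = compose(r, p)
    return r


# Every element of S_3 has a unique expression tau^i sigma^j.
elements = {compose(power(tau, i), power(sigma, j)): (i, j)
            for i in range(2) for j in range(3)}
assert len(elements) == 6


def is_homomorphism(s, t):
    """Test the candidate with alpha(sigma) = s and alpha(tau) = t in Z_6,
    extended by alpha(tau^i sigma^j) = i*t + j*s (mod 6)."""
    alpha = {x: (i * t + j * s) % 6 for x, (i, j) in elements.items()}
    return all(alpha[compose(x, y)] == (alpha[x] + alpha[y]) % 6
               for x in alpha for y in alpha)


candidates = list(product([0, 2, 4], [0, 3]))  # the six choices of Example 9
actual = [(s, t) for s, t in candidates if is_homomorphism(s, t)]
print(len(candidates), actual)  # 6 [(0, 0), (0, 3)]
```

The two survivors are the trivial homomorphism and the map sending every element outside $\langle\sigma\rangle$ to the element of order 2 in $C_6$.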


Isomorphisms


We have shown that there are two distinct groups of order 4: the cyclic group and 
the noncyclic Klein group. Determining how to distinguish between distinct groups 
leads to the notion of isomorphic groups. Roughly speaking, the two groups are 
isomorphic if they are the same except for notation. 

As an illustration, consider the groups $G = \{1, -1\}$ and $\mathbb{Z}_4^* = \{1, 3\}$. The two
Cayley tables are

      G |  1  -1          Z_4^* |  1   3
     ---+--------        -------+--------
      1 |  1  -1              1 |  1   3
     -1 | -1   1              3 |  3   1

Clearly, they are alike. In fact, because the way the unity multiplies is always
specified, we can describe both by saying that the nonunity element squares to 1.
Here is a more precise comparison: Consider the mapping $\sigma: G \to \mathbb{Z}_4^*$ given by

$\sigma(1) = 1$ and $\sigma(-1) = 3$.

Then $\sigma$ is a bijection, and we can obtain the entire Cayley table for $\mathbb{Z}_4^*$ from that
of G by replacing a with $\sigma(a)$ for every a in G. In other words, the two groups are
the same except for notation; we obtain $\mathbb{Z}_4^*$ from G by changing symbols.
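
The relabelling described above can be carried out literally. In this short sketch (the dictionary names are illustrative), each Cayley table is stored as a dictionary from pairs to products, and applying $\sigma$ to every entry of the table for G is checked to give the table for $\mathbb{Z}_4^*$.

```python
G = [1, -1]            # multiplicative group {1, -1}
H = [1, 3]             # Z_4^*, multiplication modulo 4
sigma = {1: 1, -1: 3}  # the bijection from the text

# Cayley tables as dictionaries: (row, column) -> product.
table_G = {(a, b): a * b for a in G for b in G}
table_H = {(x, y): (x * y) % 4 for x in H for y in H}

# Replace every symbol a in the table for G by sigma(a).
relabelled = {(sigma[a], sigma[b]): sigma[ab] for (a, b), ab in table_G.items()}

print(relabelled == table_H)  # True
```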

This works in general. If G and H are groups and $\sigma: G \to H$ is a bijection, we
ask when the Cayley table for H results from applying $\sigma$ to every element of the
table for G. Comparing the entries in row a and column b of the two tables,
the required condition is $\sigma(ab) = \sigma(a)\sigma(b)$ for all a and b in G. In other words, $\sigma$
must be a homomorphism. This leads to the following definition.
If G and H are groups, a mapping

$\sigma: G \to H$ is called an isomorphism

if $\sigma$ is a bijection (one-to-one and onto) which is also a homomorphism. When an
isomorphism exists from G to H we say that G is isomorphic to H and we write

$G \cong H$.

Hence, if $\sigma: G \to H$ is an isomorphism, the group H is just G with the change
of notation $g \mapsto \sigma(g)$. As in the preceding illustration, G and H are the same group
except for the symbols used. It is useful to think of isomorphic groups as two
different realizations of the same (abstract) group.²⁸


Example 10. The set $2\mathbb{Z} = \{2k \mid k \in \mathbb{Z}\}$ of even integers is an additive group, in
fact a subgroup of $\mathbb{Z}$. Show that $\mathbb{Z} \cong 2\mathbb{Z}$.


Solution. The function $\sigma: \mathbb{Z} \to 2\mathbb{Z}$ given by $\sigma(k) = 2k$ for all $k \in \mathbb{Z}$ is clearly onto,
and $\sigma$ is one-to-one because $\sigma(k) = \sigma(m)$ implies 2k = 2m, so k = m. Finally, $\sigma$ is
a homomorphism:

$\sigma(k + m) = 2(k + m) = 2k + 2m = \sigma(k) + \sigma(m)$

for all k and m in $\mathbb{Z}$. Thus $\sigma$ is an isomorphism, so $\mathbb{Z} \cong 2\mathbb{Z}$. □


Note that the argument in Example 10 shows that $\mathbb{Z} \cong n\mathbb{Z}$ for any integer $n \neq 0$.


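Example 11. If $G = \left\{ \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix} \,\middle|\, n \in \mathbb{Z} \right\}$, show that G is a group using matrix
multiplication, and that $G \cong (\mathbb{Z}, +)$.


Solution. Define $\sigma: \mathbb{Z} \to GL_2(\mathbb{R})$ by $\sigma(n) = \begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix}$ for all n in $\mathbb{Z}$. Then $\sigma$ is clearly
one-to-one, and it is a homomorphism because

$\sigma(m + n) = \begin{bmatrix} 1 & m+n \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & m \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & n \\ 0 & 1 \end{bmatrix} = \sigma(m)\,\sigma(n).$

Hence $G = \sigma(\mathbb{Z})$ is a subgroup of $GL_2(\mathbb{R})$ by Corollary 2 of Theorem 1. Moreover,
$\sigma$ is a bijection $\mathbb{Z} \to G = \sigma(\mathbb{Z})$, so $\sigma: \mathbb{Z} \to G$ is an isomorphism. □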


Clearly, $G \cong G$ for any group G (the identity map $G \to G$ is an isomorphism).
However, even though two groups are isomorphic, they sometimes appear to be quite
different. As a remarkable example, the group $\mathbb{C}^*$ of all nonzero complex numbers
is known to be isomorphic to the circle group $\mathbb{C}^0$ of complex numbers on the unit
circle.²⁹ Here is a less spectacular example. Recall that $\mathbb{R}^+ = \{r \in \mathbb{R} \mid r > 0\}$.


²⁸The term isomorphism comes from isos, meaning equal, and morphe, meaning shape.


Example 12. Show that $\mathbb{R} \cong \mathbb{R}^+$, where $\mathbb{R}$ is additive and $\mathbb{R}^+$ is multiplicative.


Solution. Define $\sigma: \mathbb{R} \to \mathbb{R}^+$ by $\sigma(r) = e^r$, where $e^x$ is the exponential function. To
show that $\sigma$ is one-to-one, let $\sigma(r) = \sigma(s)$, where $r, s \in \mathbb{R}$. Then $e^r = e^s$ so, if $\ln x$
denotes the natural logarithm of x, $r = \ln(e^r) = \ln(e^s) = s$. Thus $\sigma$ is one-to-one.
If $t \in \mathbb{R}^+$, then t > 0, so $\ln t \in \mathbb{R}$ and $\sigma(\ln t) = e^{\ln t} = t$. Hence $\sigma$ is onto. Finally,

$\sigma(r + s) = e^{r+s} = e^r e^s = \sigma(r) \cdot \sigma(s)$ for all r and s in $\mathbb{R}$,

which shows that $\sigma$ is an isomorphism. □

Example 13. If $G \cong G_1$ and $H \cong H_1$, show that $G \times H \cong G_1 \times H_1$.


Solution. Let $\sigma: G \to G_1$ and $\tau: H \to H_1$ be isomorphisms, and define a mapping
$\mu: G \times H \to G_1 \times H_1$ by $\mu(g, h) = (\sigma(g), \tau(h))$. This is a homomorphism because

$\mu[(g, h)(g', h')] = \mu(gg', hh') = (\sigma(gg'), \tau(hh')) = (\sigma(g)\sigma(g'), \tau(h)\tau(h')) = \mu(g, h) \cdot \mu(g', h')$

for all (g, h) and (g', h') in $G \times H$. The proof that $\mu$ is onto and one-to-one is left
to the reader. □


Verifying that a particular mapping is an isomorphism requires checking three
things: that it is onto; that it is one-to-one; and that it is operation-preserving.
Even if a particular mapping $\sigma: G \to H$ fails one of these tests, the groups G
and H may very well be isomorphic (for example $r \mapsto r + 1$ is a bijection $\mathbb{R} \to \mathbb{R}$, but
it is not an isomorphism). Conversely, showing that G and H are not isomorphic
entails showing that no isomorphism exists from G to H. Examples 14 and 15
illustrate this situation.


Example 14. Show that $\mathbb{Q}$ is not isomorphic to $\mathbb{Q}^*$.
Solution. Suppose that $\sigma: \mathbb{Q} \to \mathbb{Q}^*$ is an isomorphism. Then $\sigma$ is onto, so let $q \in \mathbb{Q}$
satisfy $\sigma(q) = 2$, and write $\sigma(\tfrac{1}{2}q) = a$. Since $\sigma$ is a homomorphism, we have

$a^2 = \sigma(\tfrac{1}{2}q) \cdot \sigma(\tfrac{1}{2}q) = \sigma(\tfrac{1}{2}q + \tfrac{1}{2}q) = \sigma(q) = 2.$

This is impossible because $a \in \mathbb{Q}$ (Example 3 §0.1), so no such $\sigma$ can exist. □


Example 15. Let G and H be cyclic groups with |G| = 9 and |H| = 3. Show that
G and $H \times H$ are not isomorphic, even though both groups have order 9.


Solution. Suppose that $\sigma: H \times H \to G$ is an isomorphism. If $G = \langle a\rangle$ where
o(a) = 9, let $a = \sigma(x)$ with $x \in H \times H$. Then $x^3 = 1$ (this holds in H) so
$a^3 = \sigma(x^3) = 1$, a contradiction. Hence $H \times H \not\cong G$. □


The reason that $H \times H \not\cong G$ in Example 15 is that, while $x^3 = 1$ for every
element x of $H \times H$, this is not the case for G. The condition $x^3 = 1$ for all x is a
property of the Cayley table of $H \times H$ but not of the Cayley table of G. The fact
that group isomorphisms preserve such properties is the reason that $H \times H$ is not
isomorphic to G. More generally, we can often show two groups are not isomorphic
by exhibiting such a property that holds in one but not the other.


²⁹See, for instance, Clay, J.R., The punctured plane is isomorphic to the unit circle, Journal of
Number Theory, 1, (1964), 500-501.


Theorem 3. Let G, H, and K denote groups.
(1) The identity map $1_G: G \to G$ is an isomorphism for every group G.
(2) If $\sigma: G \to H$ is an isomorphism, the inverse mapping $\sigma^{-1}: H \to G$ is
also an isomorphism.
(3) If $\sigma: G \to H$ and $\tau: H \to K$ are isomorphisms, the composite map
$\tau\sigma: G \to K$ is also an isomorphism.


Proof. (1) is clear.

(2) The inverse mapping $\sigma^{-1}: H \to G$ exists because $\sigma$ is a bijection, and $\sigma^{-1}$
is also a bijection (Theorems 5 and 6 §0.3). So it remains to show that $\sigma^{-1}$ is a
homomorphism. If $g_1$ and $h_1$ are in H, write $g = \sigma^{-1}(g_1)$ and $h = \sigma^{-1}(h_1)$. Then
$\sigma(g) = g_1$ and $\sigma(h) = h_1$, so

$\sigma^{-1}(g_1h_1) = \sigma^{-1}[\sigma(g)\sigma(h)] = \sigma^{-1}[\sigma(gh)] = gh = \sigma^{-1}(g_1) \cdot \sigma^{-1}(h_1).$

Therefore $\sigma^{-1}$ is a homomorphism, and hence is an isomorphism.

(3) The map $\tau\sigma$ is a bijection by Theorem 3 §0.3; now apply Example 7. ■


Corollary 1. The isomorphic relation $\cong$ is an equivalence for groups. That is:
(1) $G \cong G$ for every group G.
(2) If $G \cong H$ then $H \cong G$.
(3) If $G \cong H$ and $H \cong K$ then $G \cong K$.


Proof. (1), (2), and (3) follow from the corresponding items in Theorem 3. ■


As an illustration of Corollary 1, we show that if G and H are both cyclic of
order n then $G \cong H$. Indeed $G \cong \mathbb{Z}_n$ and $H \cong \mathbb{Z}_n$ by Example 13, so $G \cong H$ by
Corollary 1.


Automorphisms 
If G is a group, an isomorphism G — G is called an automorphism of G. 


Corollary 2. If G is a group, the set of all automorphisms $G \to G$ forms a group
under composition.


Proof. The automorphisms $G \to G$ are a subset of the group $S_G$ of all bijections
$G \to G$, and Theorem 3 shows that they are a subgroup of $S_G$ by the subgroup
test. ■


The set of all automorphisms of G is called the automorphism group of G, and 
is denoted aut G. 


Example 16. If G is abelian, the mapping $\sigma: G \to G$ defined by $\sigma(g) = g^{-1}$ for
all $g \in G$ is an automorphism of G. We leave the verification as Exercise 10.


If G is a group and $a \in G$, define a mapping $\sigma_a: G \to G$ by

$\sigma_a(g) = aga^{-1}$ for all $g \in G$.

This map $\sigma_a$ is an automorphism of G (see Example 17 below), called the inner
automorphism of G determined by a. Note that if $H \subseteq G$ is a subgroup then
$\sigma_a(H) = aHa^{-1}$ is a conjugate of H.
Example 17. If G is any group and $a \in G$, show that

(1) For each $a \in G$, $\sigma_a$ is an automorphism of G.

(2) If $\theta: G \to \operatorname{aut} G$ is defined by $\theta(a) = \sigma_a$ for each $a \in G$, then $\theta$ is a
homomorphism, that is $\sigma_{ab} = \sigma_a\sigma_b$ for all $a, b \in G$.

(3) The image $\theta(G) = \{\sigma_a \mid a \in G\}$ of $\theta$ is a subgroup of aut G.

Solution. (1) We leave as Exercise 11 the verification that $\sigma_a$ is a bijection for all
$a \in G$. If $g, h \in G$ we have

$\sigma_a(g) \cdot \sigma_a(h) = aga^{-1} \cdot aha^{-1} = ag1ha^{-1} = agha^{-1} = \sigma_a(gh).$

Hence $\sigma_a$ is an automorphism of G, proving (1).
(2) We must show that $\sigma_a\sigma_b = \sigma_{ab}$ for $a, b \in G$. But for any $g \in G$:

$\sigma_a\sigma_b(g) = \sigma_a(bgb^{-1}) = a(bgb^{-1})a^{-1} = (ab)g(ab)^{-1} = \sigma_{ab}(g).$

(3) This follows from (2) and Corollary 2 of Theorem 1. □

In Example 17, the group $\theta(G)$ of all inner automorphisms of G is denoted

$\operatorname{inn} G = \{\sigma_a \mid a \in G\}.$


The group inn G is an important subgroup of aut G, and it is easily described
because each inner automorphism $\sigma_a$ is given explicitly in terms of a. By contrast,
the group aut G can be difficult to determine. We do one simple case in Example
18 below.


Because it is a homomorphism, every isomorphism preserves the unity, inverses, 
and powers. But isomorphisms also preserve the order of an element (compare with 
Corollary 1 of Theorem 1). 


Theorem 4. Let $\sigma: G \to G_1$ be an isomorphism. Then $o(\sigma(g)) = o(g)$ for all $g \in G$.
Proof. It suffices to show that $g^k = 1$ if and only if $[\sigma(g)]^k = 1$. If $g^k = 1$, then
$[\sigma(g)]^k = \sigma(g^k) = \sigma(1) = 1$ by Theorem 1. Conversely, if $[\sigma(g)]^k = 1$, we have
$\sigma(g^k) = [\sigma(g)]^k = 1 = \sigma(1)$, so $g^k = 1$ because $\sigma$ is one-to-one. ■


Example 18. If G is cyclic of order 6, show that $\operatorname{aut} G = \{1_G, \lambda\}$, where $\lambda(g) = g^{-1}$
for all $g \in G$.


Solution. Both $1_G$ and (as G is abelian) $\lambda$ are automorphisms of G. If $\sigma: G \to G$
is any automorphism, we show $\sigma = 1_G$ or $\sigma = \lambda$. Write $G = \langle a\rangle$, where o(a) = 6.
Theorem 2 shows that the choice of $\sigma(a)$ completely determines $\sigma$. By Theorem 4,
we have $o(\sigma(a)) = o(a) = 6$, so $\sigma(a) = a$, or $\sigma(a) = a^5 = a^{-1}$. If $g \in G$, write $g = a^k$
for some $k \in \mathbb{Z}$, so that

$\sigma(g) = \sigma(a^k) = [\sigma(a)]^k.$

If $\sigma(a) = a$, this gives $\sigma(g) = a^k = g$ for all $g \in G$, that is $\sigma = 1_G$. If $\sigma(a) = a^{-1}$, it
shows that $\sigma(g) = (a^{-1})^k = (a^k)^{-1} = g^{-1}$ for all $g \in G$, that is $\sigma = \lambda$. □
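
Example 18 can be confirmed by exhaustive search. The sketch below models the cyclic group of order 6 as additive $\mathbb{Z}_6$ (an assumption made for the illustration) and tests every bijection of the six residues for the homomorphism property; only the identity and inversion survive.

```python
from itertools import permutations

n = 6  # additive Z_6 as a model of the cyclic group of order 6


def is_automorphism(f):
    """f is a tuple with f[x] the image of x. Every tuple produced by
    permutations() is a bijection, so only the homomorphism property
    f(x + y) = f(x) + f(y) needs checking."""
    return all(f[(x + y) % n] == (f[x] + f[y]) % n
               for x in range(n) for y in range(n))


autos = [f for f in permutations(range(n)) if is_automorphism(f)]

identity = tuple(range(n))
inversion = tuple((-x) % n for x in range(n))  # x -> -x, the map lambda

print(autos)  # [(0, 1, 2, 3, 4, 5), (0, 5, 4, 3, 2, 1)]
assert set(autos) == {identity, inversion}
```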




Cayley’s Theorem 


We conclude this section with a proof of a theorem of Cayley (proved in 1878) that
every finite group is isomorphic to a group of permutations. If X is a nonempty set,
recall that $S_X$ denotes the group of all permutations of X (bijections $X \to X$) under
composition. We need one simple observation about these permutation groups: If
a bijection $\sigma: X \to Y$ exists then $S_X \cong S_Y$. Indeed, if $\lambda \in S_X$ we have

$Y \xrightarrow{\ \sigma^{-1}\ } X \xrightarrow{\ \lambda\ } X \xrightarrow{\ \sigma\ } Y$

so $\sigma\lambda\sigma^{-1} \in S_Y$. But then $\varphi: S_X \to S_Y$ given by $\varphi(\lambda) = \sigma\lambda\sigma^{-1}$ is an isomorphism,
as can be readily verified. In particular, $S_X \cong S_n$ whenever |X| = n.

Now let G be a group. We noted earlier that each row of the Cayley table
of G is a permutation of G in the sense that each element appears exactly once.
Since the row of $a \in G$ is $\{ag \mid g \in G\}$, this is just the assertion that $g \mapsto ag$ is a
bijection $G \to G$. This is the connection that Cayley noticed between the groups G
and $S_G$.


Theorem 5. Cayley’s Theorem. Every group G of order n is isomorphic to a
subgroup of $S_n$.


Proof. By the preceding discussion, there is an isomorphism $\theta: S_G \to S_n$. So if we
can find a one-to-one homomorphism $\mu: G \to S_G$, then $G \cong \theta\mu(G) \subseteq S_n$ because
$\theta\mu: G \to \theta\mu(G)$ is an isomorphism, and Cayley’s theorem follows.

If $a \in G$, define $\mu_a: G \to G$ by $\mu_a(g) = ag$ for all $g \in G$. Then it is easy to verify
that $\mu_a$ is a bijection (so $\mu_a \in S_G$). Hence define $\mu: G \to S_G$ by $\mu(a) = \mu_a$ for all
$a \in G$. Then $\mu$ is a homomorphism because $\mu_{ab} = \mu_a\mu_b$ for all $a, b \in G$ (verify).
Finally, $\mu$ is one-to-one because $\mu_a = \mu_b$ implies that $a = \mu_a(1) = \mu_b(1) = b$. So $\mu$
is a one-to-one homomorphism, as required. ■
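
The construction in the proof can be carried out for a small group. The sketch below takes $G = S_3$ (so n = 6), numbers its elements 0 to 5, and records each left multiplication $g \mapsto ag$ as a permutation of those indices. The numbering of the elements plays the role of the bijection from G to {1, ..., n} in the proof; the names `index`, `mu`, and `images` are illustrative.

```python
from itertools import permutations


def compose(p, q):
    """Composite permutation p∘q, so that (p∘q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


G = list(permutations(range(3)))           # the six elements of S_3
index = {g: i for i, g in enumerate(G)}    # a numbering of G by 0, ..., 5


def mu(a):
    """Left multiplication g -> ag, recorded as a permutation of the indices."""
    return tuple(index[compose(a, g)] for g in G)


images = {a: mu(a) for a in G}

# Each mu_a is a bijection of {0, ..., 5}.
assert all(sorted(p) == list(range(len(G))) for p in images.values())

# mu is a homomorphism: mu_{ab} = mu_a mu_b.
assert all(images[compose(a, b)] == compose(images[a], images[b])
           for a in G for b in G)

# mu is one-to-one, so G is isomorphic to its image inside S_6.
assert len(set(images.values())) == len(G)
print(len(set(images.values())))  # 6
```

Here a group of order 6 sits inside $S_6$, which has 720 elements.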


Cayley’s theorem shows that every abstract group of order n is (up to isomorphism) a subgroup of $S_n$. Hence, to study the groups of order n, we need only study
the symmetric group $S_n$. At first this approach seems to be an advantage because
$S_n$ consists of concrete mappings that can be analyzed using tools (such as cycle
factorization and parity) not available in an abstract group. However, these
symmetric groups are extremely large, so a subgroup of order n is lost in $S_n$ (for
example, $|S_{10}| = 10! = 3{,}628{,}800$). Nevertheless, in Section 8.3 we give a generalization
of Cayley’s theorem that cuts down the size of the symmetric group and so provides
more information about G.


Arthur Cayley (1821-1895) Cayley showed his mathematical talent at an early age, 
quickly excelling at school. After some initial reluctance, his merchant father sent him 
to Cambridge at the age of 17. During the following 8 years he read the works of the 
masters and published more than 20 papers on topics that would occupy him for the 
rest of his life. In addition, he developed broad interests in literature (he read Greek, 
German, and French, as well as English), architecture, and painting (he demonstrated 
talent in watercolors) and became an enthusiastic hiker and mountaineer. 


At the age of 25, with no position as a mathematician in view, he began legal training 
and was admitted to the bar three years later. He earned a comfortable living as a 
lawyer but resisted the temptation to make a lot of money so as to free himself to 




do mathematics. And do it he did, publishing nearly 300 papers in 14 years. Finally, 
in 1863, he accepted the Sadlerian professorship at Cambridge and remained there for 
the rest of his life, valued for his administrative and teaching skills, as well as for his 
scholarship. 


Although Cayley introduced the concept of an abstract group, his main accomplishments 
lay elsewhere. With his lifelong friend J. J. Sylvester, he founded the theory of invariants; 
he was one of the first to consider geometry of more than three dimensions; and he 
initiated matrix algebra. He also wrote on quaternions, the theory of equations, dynamics, 
and astronomy. He continued working until his death, leaving 966 papers filling 13 
volumes of 600 pages each. 


Exercises 2.5


1. In each case show that $\alpha$ is a homomorphism and decide if it is onto or one-to-one.

(a) $\alpha : \mathbb{R} \to GL_2(\mathbb{R})$ given by $\alpha(r) = \begin{bmatrix} 1 & r \\ 0 & 1 \end{bmatrix}$ for all $r$ in $\mathbb{R}$.

(b) $\alpha : G \to G \times G$ given by $\alpha(g) = (g, g)$ for all $g$ in the group $G$.


2. If $G = G_1 \times G_2$ is a direct product of groups, define $\pi_1 : G \to G_1$ and $\sigma_1 : G_1 \to G$ by $\pi_1(g_1, g_2) = g_1$ and $\sigma_1(g_1) = (g_1, 1)$. Show that $\pi_1$ is an onto homomorphism (called the projection of $G$ onto $G_1$), and $\sigma_1$ is a one-to-one homomorphism (called the injection of $G_1$ into $G$).

3. If $G$ is any group, define $\alpha : G \to G$ by $\alpha(g) = g^{-1}$. Show that $G$ is abelian if and only if $\alpha$ is a homomorphism.

4. If $m \in \mathbb{Z}$ is fixed and $G$ is an abelian group, define $\alpha : G \to G$ by $\alpha(a) = a^m$ for all $a \in G$. Show that $\alpha$ is a homomorphism.

5. Let $\sigma_a$ be the inner automorphism of $G$ determined by $a$. Show that $\sigma_a = 1_G$ if and only if $a \in Z(G)$.

6. Show that there are exactly two homomorphisms $\alpha : C_6 \to C_4$. [Hint: Example 9.]

7. If $n > 1$, give an example of a group homomorphism $\alpha : G \to G_1$ and an element $g \in G$ such that $o(g) = \infty$ but $o(\alpha(g)) = n$.

8. (a) Describe all group homomorphisms $\mathbb{Z} \to \mathbb{Z}$.

(b) How many are onto?

9. If $\alpha : G \to G_1$ is a homomorphism, show that $K = \{g \in G \mid \alpha(g) = 1\}$ is a subgroup of $G$ (called the kernel of $\alpha$).


10. Define $\lambda : G \to G$ by $\lambda(g) = g^{-1}$ for all $g \in G$. Show that $\lambda$ is a bijection. If $G$ is abelian, show that $\lambda$ is an automorphism of $G$.

11. If $G$ is a group and $a \in G$, show that the inner automorphism $\sigma_a : G \to G$ is a bijection.

12. In each case determine whether $\alpha : G \to G_1$ is an isomorphism. Give reasons.


(a) G=G,=R, a(x) = 22 (b)G=G,=Z, a(n) =2n 
(c) G@=Gi=Z3, a(g)=9? (d) G=G1=Z}, a(g) =9° 
(ec) G=G1=Z7, alg) = 29 (f)@=G,=Zs, a(g) = 29 
(g)@=G,=Rt, a(g) =9? (h) G=R,G,=Rt, a(g) =o(g) 
(i) G=2Z, Gj =3Z, a(2k) = 3k Gj) G=G,=R, a(g)=ag,a#0 
- -1 
Show that G= {l 4 : ; i ; : A ; & } is a subgroup of GL2(Z) 


isomorphic to $\{1, -1, i, -i\}$.

14. If $G$ is an infinite cyclic group, show that $G \cong \mathbb{Z}$.


15. If $G = \langle a \rangle$ is cyclic with $o(a) = n$, show that $G \cong \mathbb{Z}_n$. [Hint: $\bar{k} = \bar{m}$ in $\mathbb{Z}_n$ if and only if $a^k = a^m$ by Theorem 2 §2.4.]

16. Show that $\sigma : \mathbb{C}^* \to \mathbb{C}^*$ is an automorphism if $\sigma(z) = \bar{z}$ for all $z \in \mathbb{C}^*$ (here $\bar{z}$ denotes the complex conjugate of $z$).

17. If $g$ and $h$ are elements of a group $G$, show that $\langle gh \rangle \cong \langle hg \rangle$.

18. If $G$ is a group of order 2, show that $G \times G \cong K_4$.

19. If $\sigma : G \to G_1$ is an isomorphism, show that $Z(G_1) = \sigma[Z(G)]$, where we have $\sigma[Z(G)] = \{\sigma(z) \mid z \in Z(G)\}$.

20. Write $n\mathbb{Z} = \{nk \mid k \in \mathbb{Z}\}$. Show that $n\mathbb{Z} \cong m\mathbb{Z}$ whenever $n \neq 0$ and $m \neq 0$.

Show that Zj, is not isomorphic to Zi. 

22. Show that $\mathbb{R}$ is not isomorphic to $\mathbb{R}^*$.

23. Show that the circle group $\mathbb{C}^0 = \{z \in \mathbb{C} \mid |z| = 1\}$ is not isomorphic to $\mathbb{R}^*$.

24. Find two nonisomorphic groups of order $n^2$ for any integer $n \geq 2$.

25. Are the additive groups $\mathbb{Z}$ and $\mathbb{Q}$ isomorphic? Support your answer.

Show that Zi, & Zig. 

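27. If $G = \langle a \rangle$ and $G_1 = \langle b \rangle$, where $o(a) = o(b) = 6$, describe all isomorphisms $G \to G_1$.

28. Show that $\mathbb{R}^+ \times \mathbb{C}^0 \cong \mathbb{C}^*$, where $\mathbb{C}^0 = \{z \in \mathbb{C} \mid |z| = 1\}$ is the circle group.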

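29. Define $\tau_{a,b} : \mathbb{R} \to \mathbb{R}$ by $\tau_{a,b}(x) = ax + b$ for all $x \in \mathbb{R}$, and denote $G_1 = \{\tau_{a,b} \mid a, b \in \mathbb{R},\ a \neq 0\}$. Let $G = \left\{ \begin{bmatrix} a & b \\ 0 & 1 \end{bmatrix} \,\middle|\, a, b \in \mathbb{R},\ a \neq 0 \right\}$. Show that $G$ and $G_1$ are subgroups of $GL_2(\mathbb{R})$ and $S_{\mathbb{R}}$, respectively, and that $G \cong G_1$.

30. If $G = \left\{ \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \,\middle|\, a, b \in \mathbb{R},\ a \text{ and } b \text{ not both } 0 \right\}$, show that $G$ is a subgroup of $M_2(\mathbb{R})^*$ and that $G \cong \mathbb{C}^*$.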

31. In each case, find $\operatorname{aut} G$, where $G = \langle a \rangle$ is cyclic of order $n$: (a) $n = 2$ (b) $n = 3$

32. If $G$ is infinite cyclic, determine $\operatorname{aut} G$.

33. If $Z(G) = \{1\}$, show that $G \cong \operatorname{inn} G$.

34. Given $z \in Z(G)$, let $G^*$ denote the set $G$ with a new operation $a * b = abz^{-1}$. Show that $G^*$ is a group and $G^* \cong G$.

35. If $G$ is a group and $g \in G$, let $S(g) = \{\sigma \in \operatorname{aut} G \mid \sigma(g) = g\}$.

(a) Show that $S(g)$ is a subgroup of $\operatorname{aut} G$ for all $g \in G$.

(b) If $g_1 = \tau(g)$, $\tau \in \operatorname{aut} G$, show that $S(g)$ and $S(g_1)$ are conjugate in $\operatorname{aut} G$.

36. In a group $G$, write $a \sim b$ if $b = gag^{-1}$ for some $g \in G$ ($a$ is conjugate to $b$).

(a) Show that $\sim$ is an equivalence relation on $G$.

(b) Determine which elements of $G$ have singleton equivalence classes.

37. If $G = \langle X \rangle$ and $\alpha : G \to G_1$ is an onto homomorphism, show that $G_1 = \langle \alpha(X) \rangle$, where $\alpha(X) = \{\alpha(x) \mid x \in X\}$.

Show that Zi, = Zij,. 


2.6 COSETS AND LAGRANGE’S THEOREM 


He [Lagrange] would set to mathematics all the little themes on physical inquiries which 
his friends brought him, much as Schubert set to music any stray rhyme that took his 
fancy. 


—Herbert Westron Turnbull 


In this section we prove one of the most important theorems about finite groups, 
Lagrange’s theorem, which asserts that the order of a subgroup of a finite group G 
is a divisor of |G|. This has far-reaching consequences as we shall see. The proof 
involves counting elements of G, and depends on the following basic notion. 

Let $H$ be a subgroup of a group $G$. If $a \in G$ we identify two subsets of $G$:


$Ha = \{ha \mid h \in H\}$, the right coset of $H$ generated by $a$.
$aH = \{ah \mid h \in H\}$, the left coset of $H$ generated by $a$.


We have $H1 = H = 1H$, so $H$ is a right and left coset of itself. Also the fact that
$1 \in H$ shows $a \in Ha$ and $a \in aH$ for all $a$. Of course, if $G$ is abelian then $Ha = aH$
for all $a \in G$ and all subgroups $H$ of $G$. However, this may not hold if $G$ is not
abelian (see Example 5 below).


Example 1. Let $K_4 = \{1, a, b, ab\}$ be the Klein group where $o(a) = o(b) = 2$ and
$ab = ba$. If $H = \{1, a\}$, find the cosets of $H$ in $K_4$.


Solution. $H1 = H$ and $Ha = \{a, a^2\} = \{a, 1\} = H$ too. Similarly, $Hb = \{b, ab\}$ and
$Hab = \{ab, a^2b\} = \{ab, b\} = Hb$. Thus, there are exactly two cosets of $H$ in $K_4$:
$H = \{1, a\}$ and $Hb = \{b, ab\} = bH$. □


Note that the cosets $H = \{1, a\}$ and $\{b, ab\}$ form a partition$^{30}$ of $K_4$. This holds in
general and, with the other properties in Theorem 1, makes finding cosets easier.


Theorem 1. Let $H$ be a subgroup of a group $G$ and let $a, b \in G$.
(1) $H = H1$.
(2) $Ha = H$ if and only if $a \in H$.
(3) $Ha = Hb$ if and only if $ab^{-1} \in H$.
(4) If $a \in Hb$, then $Ha = Hb$.
(5) Either $Ha = Hb$ or $Ha \cap Hb = \emptyset$.
(6) The distinct right cosets of $H$ are the cells of a partition of $G$.


Proof. First, (1) is clear because $1 \in H$ and (2) follows from (3) with $b = 1$.

(3) If $Ha = Hb$ then $a \in Ha = Hb$, say $a = hb$, $h \in H$. Hence $ab^{-1} = h \in H$.
Conversely, suppose that $ab^{-1} \in H$. Then $ha = h(ab^{-1})b \in Hb$ for every $h \in H$, so $Ha \subseteq Hb$. But
$ba^{-1} = (ab^{-1})^{-1} \in H$ too, so $Hb \subseteq Ha$ follows in the same way. Hence $Ha = Hb$.

(4) If $a \in Hb$ then $ab^{-1} \in H$ so $Ha = Hb$ by (3).

(5) If $Ha \cap Hb \neq \emptyset$, we show $Ha = Hb$. If $x \in Ha \cap Hb$, then $x \in Ha$ so
$Hx = Ha$ by (4). Similarly $Hx = Hb$, so $Ha = Hx = Hb$. This proves (5).

(6) If $Ha \neq Hb$ then $Ha$ and $Hb$ are disjoint by (5). In other words, the set of
right cosets is pairwise disjoint. Moreover, each $a \in G$ belongs to some right coset
of $H$ (in fact $a \in Ha$). This gives (6). ∎


Corollary. The analogue of Theorem 1 for left cosets also holds. In particular,
(3) becomes $aH = bH$ if and only if $b^{-1}a \in H$. See Exercise 5.


$^{30}$Recall (Section 0.4) that a partition of a nonempty set $X$ is a collection of nonempty subsets
of $X$ (called the cells of the partition) which are pairwise disjoint (distinct cells are disjoint)
and every element of $X$ is in some cell (hence in exactly one cell).


Mnemonic. The condition in (3) that $Ha = Hb$ if and only if $ab^{-1} \in H$ can be
remembered by “right multiplying” $Ha = Hb$ by $b^{-1}$. Similarly, $aH = bH$ if and
only if $b^{-1}a \in H$ can be recalled by “left multiplying” by $b^{-1}$.

Example 2. Let $G = \langle a \rangle$ where $o(a) = 6$. Find the right cosets of the subgroups
$H = \langle a^3 \rangle$ and $K = \langle a^2 \rangle$.

Solution. We have $H = H1 = \{1, a^3\}$. Thus $a^3 \in H$ so $H = Ha^3$ by (4) of Theorem
1. In the same way $Ha = \{a, a^4\} = Ha^4$, and $Ha^2 = \{a^2, a^5\} = Ha^5$. This exhausts
$G$ so the cosets are
$H = \{1, a^3\}$, $Ha = \{a, a^4\}$, and $Ha^2 = \{a^2, a^5\}$.
Turning to $K$ we find the partition in one step:
$K = \{1, a^2, a^4\}$ and $Ka = \{a, a^3, a^5\}$. □

In Example 2, the cosets of $H$ (and those of $K$) do indeed partition $G$ into
pairwise disjoint cells, as Theorem 1(6) asserts. What is new here is that all the
cosets of $H$ have the same number of elements and, similarly, all the cosets of $K$
have the same number of elements. This fact holds in general and lies at the heart
of Lagrange’s theorem, as we show shortly.
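
The coset computation in Example 2 can be checked mechanically. The following is a minimal sketch, assuming the cyclic group of order 6 is modelled by exponents modulo 6 (so the integer k stands for a^k); the helper name `right_cosets` is hypothetical and not part of the text.

```python
def right_cosets(group, subgroup, op):
    """Return the distinct right cosets Ha as a list of frozensets."""
    cosets = []
    for a in group:
        coset = frozenset(op(h, a) for h in subgroup)
        if coset not in cosets:
            cosets.append(coset)
    return cosets

n = 6
G = list(range(n))                 # the exponent k stands for a^k
mul = lambda j, k: (j + k) % n     # a^j a^k = a^(j+k), exponents taken mod 6
H = [0, 3]                         # H = <a^3> = {1, a^3}
K = [0, 2, 4]                      # K = <a^2> = {1, a^2, a^4}

cosets_H = right_cosets(G, H, mul)
cosets_K = right_cosets(G, K, mul)
print(cosets_H)
print(cosets_K)
```

In this model the cosets of H all have two elements and those of K all have three, and in each case the cosets are pairwise disjoint with union G.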


Example 3. Find all the right cosets of the subgroup $4\mathbb{Z}$ in the additive group $\mathbb{Z}$.
Solution. The notation is additive, so the right coset of $4\mathbb{Z}$ generated by $a$ is $4\mathbb{Z} + a$.
For $a = 0$, we obtain the coset $4\mathbb{Z}$ itself:

$4\mathbb{Z} = 4\mathbb{Z} + 0 = \{4k \mid k \in \mathbb{Z}\}$.
Now $1 \notin 4\mathbb{Z}$, so it generates a new coset by Theorem 1:

$4\mathbb{Z} + 1 = \{4k + 1 \mid k \in \mathbb{Z}\}$.
We continue in this way, with 2 and 3 generating new cosets:

$4\mathbb{Z} + 2 = \{4k + 2 \mid k \in \mathbb{Z}\}$,

$4\mathbb{Z} + 3 = \{4k + 3 \mid k \in \mathbb{Z}\}$.
This is a complete list of cosets, because every integer has the form $4k + r$, where
the remainder $r$ is 0, 1, 2, or 3. □


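Example 4. In the group $\mathbb{C}^*$ give a geometrical description of the cosets of the
circle group $\mathbb{C}^0 = \{z \in \mathbb{C} \mid |z| = 1\}$.

Solution. Recall that $\mathbb{C}^0 = \{e^{i\theta} \mid \theta \text{ any angle}\}$ is the unit circle. If $z \in \mathbb{C}^*$, then
$z = re^{i\theta}$, where $r = |z| > 0$ and $e^{i\theta} \in \mathbb{C}^0$. Hence $\mathbb{C}^0 z = \mathbb{C}^0 r = \{re^{i\theta} \mid \theta \text{ any angle}\}$.
In other words, $\mathbb{C}^0 z$ is the circle with its center at the origin and radius $r = |z|$. □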


All these examples involve abelian groups so Ha = aH always holds. However, 
this does not hold in general as Example 5 shows. 


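Example 5. Let $G = S_3 = \{\varepsilon, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$, where $\sigma^3 = \varepsilon = \tau^2$ and $\sigma\tau\sigma = \tau$.
Find the right and left cosets of $H = \{\varepsilon, \tau\}$.


Solution. As $\sigma\tau = \tau\sigma^{-1} = \tau\sigma^2$, the cosets are


$H = H\varepsilon = \{\varepsilon, \tau\}$, $H\sigma = \{\sigma, \tau\sigma\}$, $H\sigma^2 = \{\sigma^2, \tau\sigma^2\}$,
$H = \varepsilon H = \{\varepsilon, \tau\}$, $\sigma H = \{\sigma, \tau\sigma^2\}$, $\sigma^2 H = \{\sigma^2, \tau\sigma\}$.
Observe that $H\sigma \neq \sigma H$ and $H\sigma^2 \neq \sigma^2 H$ in this case. □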


Note that, even though the right and left cosets of H may be different, they all have 
the same number of elements and there are the same number of them. This holds 
in general, even when the cosets are infinite sets. 
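
The difference between right and left cosets in a nonabelian group can be seen by brute force. The sketch below assumes permutations of {0, 1, 2} are stored as tuples (so `p[i]` is the image of `i`), with `compose(p, q)` meaning "apply q first, then p"; these conventions and names are illustrative choices, not the book's notation.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

S3 = list(permutations(range(3)))
e = (0, 1, 2)
tau = (1, 0, 2)      # the transposition (1 2), written on the points {0, 1, 2}
H = [e, tau]

right = {frozenset(compose(h, g) for h in H) for g in S3}   # the cosets Hg
left = {frozenset(compose(g, h) for h in H) for g in S3}    # the cosets gH

print(len(right), len(left))   # same number of right and left cosets
print(right == left)           # but the two partitions differ
```

Both families contain three cosets of two elements each, yet the two partitions of the group are different, which is the phenomenon Example 5 illustrates.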

Two sets X and Y are said to have the same cardinality if there is a bijection 
from one to the other, and in this case we write |X| = |Y|. Of course if the sets are 
finite this means that they have the same number of elements, and this terminology 
is sometimes used for infinite sets too. 


Lemma. Let $H$ be a subgroup of a group $G$. Then
(1) $|H| = |Ha| = |aH|$ for all $a \in G$.
(2) The map $Ha \mapsto a^{-1}H$ is a bijection $\{Ha \mid a \in G\} \to \{bH \mid b \in G\}$.
Proof. (1) $|H| = |Ha|$ since $h \mapsto ha$ is a bijection $H \to Ha$. Similarly, $|H| = |aH|$.
(2) We have $Ha = Hb \Leftrightarrow ab^{-1} \in H \Leftrightarrow a^{-1}H = b^{-1}H$ by Theorem 1 and its
Corollary. So the map in (2) is well defined and one-to-one. It is clearly onto. ∎


Part (2) of the Lemma shows that the sets of right and left cosets of a subgroup 
H of G have the same number of members (possibly infinite), and this common 
value has a name: 


The index $|G : H|$ of $H$ in $G$ is defined to be the number of
distinct right (or left) cosets of $H$ in $G$.


Note that a subgroup H can be of finite index in G even if both H and G are 
infinite (for example |Z : 4Z| = 4 by Example 3). 
The Lemma enables us to prove the single most important theorem about finite 
groups: It introduces numerical relations into the theory. 
Theorem 2. Lagrange’s Theorem. Let $H$ be any subgroup of a finite group $G$.
(1) Then $|H|$ divides $|G|$.
(2) The quotient $\frac{|G|}{|H|} = |G : H|$ is the index of $H$ in $G$.


Proof. Write $k = |G : H|$, and let $Ha_1, Ha_2, \ldots, Ha_k$ be the distinct right cosets of
$H$ in $G$. Then
$$G = Ha_1 \cup Ha_2 \cup \cdots \cup Ha_k$$
which is a disjoint union by Theorem 1. By the Lemma, $|Ha_i| = |H|$ for each $i$, so
$$|G| = |Ha_1| + |Ha_2| + \cdots + |Ha_k| = |H| + |H| + \cdots + |H| = k|H|.$$
This proves (1), and it also proves (2) because $\frac{|G|}{|H|} = k = |G : H|$. ∎


Note that Lagrange’s theorem shows that both the order and the index of a subgroup
of a finite group $G$ are divisors of $|G|$.
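
The counting argument can be confirmed on a small example. This sketch, under the assumption that $S_4$ is represented by tuples and that only cyclic subgroups are tested, checks $|G| = |G : H| \cdot |H|$ for the subgroup generated by each element; the helper names `compose` and `generated` are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generated(g):
    """Return the cyclic subgroup <g> as a list of permutations."""
    identity = tuple(range(len(g)))
    H, x = [identity], g
    while x != identity:
        H.append(x)
        x = compose(x, g)
    return H

G = list(permutations(range(4)))       # S_4, so |G| = 24
checks = []
for g in G:
    H = generated(g)
    cosets = {frozenset(compose(h, a) for h in H) for a in G}   # right cosets Ha
    checks.append((len(H), len(cosets)))                        # (|H|, |G : H|)

orders = sorted({order for order, _ in checks})
print(orders)
print(all(order * index == len(G) for order, index in checks))
```

Only cyclic subgroups are sampled here, so this is an illustration of the theorem rather than an exhaustive check over all subgroups.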

Lagrange’s theorem has many important consequences. 


Corollary 1. If $G$ is a finite group and $g \in G$, then $o(g)$ divides $|G|$.


Proof. The cyclic subgroup $H = \langle g \rangle$ generated by $g$ has $o(g) = |H|$ by the Corollary
to Theorem 3 §2.4. So Lagrange’s theorem applies. ∎


Note that the converse of Lagrange’s theorem (and of Corollary 1) is false. For
example, $|A_4| = 12$, but $A_4$ has no subgroup of order 6 and hence no element of
order 6 (Exercise 34).
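
The failure of the converse can be observed by exhaustive search, which is not a substitute for the proof asked for in Exercise 34. The sketch below assumes $A_4$ is built as the even permutations of {0, 1, 2, 3} and uses the fact that a nonempty finite subset closed under the operation is a subgroup; the function names are hypothetical.

```python
from itertools import permutations, combinations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

A4 = [p for p in permutations(range(4)) if is_even(p)]
identity = (0, 1, 2, 3)
others = [p for p in A4 if p != identity]

def is_subgroup(S):
    # A nonempty finite subset closed under the operation is a subgroup.
    return all(compose(x, y) in S for x in S for y in S)

def subgroups_of_order(k):
    """All subgroups of A4 with exactly k elements (each contains the identity)."""
    found = []
    for rest in combinations(others, k - 1):
        S = set(rest) | {identity}
        if is_subgroup(S):
            found.append(S)
    return found

counts = {k: len(subgroups_of_order(k)) for k in (1, 2, 3, 4, 6, 12)}
print(len(A4), counts)
```

Every divisor of 12 other than 6 occurs as the order of a subgroup, while the search finds none of order 6.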


Corollary 2. If $G$ is a group and $|G| = n$, then $g^n = 1$ for every $g \in G$.


Proof. If $o(g) = m$ then $m \mid n$ by Corollary 1, say $n = qm$ for some $q \in \mathbb{Z}$. But then
$g^n = (g^m)^q = 1^q = 1$. ∎


The next corollary will be referred to later, and illustrates how the numerical 
information in Lagrange’s theorem can determine the structure of a finite group. 


Corollary 3. If $p$ is a prime, then every group $G$ of order $p$ is cyclic. In fact, $G = \langle g \rangle$
for every element $g \neq 1$ in $G$, so the only subgroups of $G$ are $\{1\}$ and $G$.


Proof. Let $g \neq 1$ in $G$ and write $H = \langle g \rangle$. Then $|H|$ divides $|G| = p$, so $|H| = 1$ or
$|H| = p$. But $|H| \neq 1$ because $H$ contains both 1 and $g \neq 1$, so $|H| = p = |G|$. This
implies that $H = G$ because $G$ is finite. Finally, if $K \neq \{1\}$ is a subgroup of $G$, and
$1 \neq k \in K$, then $G = \langle k \rangle \subseteq K \subseteq G$, and so $K = G$. ∎


Corollary 4. Let $H$ and $K$ be finite subgroups of a group $G$. If $|H|$ and $|K|$ are
relatively prime, then $H \cap K = \{1\}$.


Proof. As $H \cap K$ is a subgroup of both $H$ and $K$, $|H \cap K|$ must divide both $|H|$
and $|K|$ by Lagrange’s theorem. Since $|H|$ and $|K|$ are relatively prime, it follows
that $|H \cap K| = 1$. The Corollary follows. ∎


Example 6. Let $K \subseteq H \subseteq G$ be finite groups. If $|G : K|$ is a prime, show that
$H = K$ or $H = G$.


Solution. By Lagrange’s theorem, $|G : H| \cdot |H : K| = \frac{|G|}{|H|} \cdot \frac{|H|}{|K|} = \frac{|G|}{|K|} = |G : K|$.


Since $|G : K|$ is a prime, either $|G : H| = 1$ or $|H : K| = 1$; that is, either $H = G$ or
$H = K$. □


We showed earlier (Example 18 §2.2) that every group of order $4 = 2^2$ is either
cyclic or is isomorphic to the Klein group. In Example 7, Lagrange’s theorem is
used to give an analogous result for any prime $p$ in place of 2.


Example 7. If $G$ is a group and $|G| = p^2$ where $p$ is a prime, show that either $G$ is
cyclic or $g^p = 1$ for every element $g \in G$.


Solution. Assume that $G$ is not cyclic, and let $g \in G$. Then $o(g) \mid p^2$, so $o(g) = 1$, $p$, or $p^2$. But
$o(g) \neq p^2$ because $G$ is not cyclic, so $o(g)$ is 1 or $p$. Either way $g^p = 1$. □


Dihedral Groups
Recall (Example 8 §2.2) that the group $S_3$ can be presented as follows:


$S_3 = \{\varepsilon, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$, $o(\sigma) = 3$, $o(\tau) = 2$, and $\sigma\tau\sigma = \tau$.


In fact, we can take $\sigma = (1\ 2\ 3)$ and $\tau = (1\ 2)$, but the point here is that the
three conditions $o(\sigma) = 3$, $o(\tau) = 2$, and $\sigma\tau\sigma = \tau$ are themselves sufficient to fill in
the Cayley table of the group.

We now construct a family of groups $D_2, D_3, \ldots, D_n, \ldots$ each presented in much
the same way as $S_3$, and having $D_3 \cong S_3$. We realize them as subgroups of the group
$GL_2(\mathbb{C})$ of $2 \times 2$ invertible matrices with complex entries.

Let $n \geq 2$ be fixed and let $\omega = e^{2\pi i/n}$ (an $n$th root of unity). Then $o(\omega) = n$ in
$\mathbb{C}^*$. Consider the matrices:

$$a = \begin{bmatrix} \omega & 0 \\ 0 & \omega^{-1} \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

in $GL_2(\mathbb{C})$. It is easy to verify that $o(a) = n$, $o(b) = 2$, and $aba = b$. This last
equation shows that $ab = ba^{-1} = ba^{n-1}$, and hence that the finite set

$$G = \{I, a, a^2, \ldots, a^{n-1}, b, ba, ba^2, \ldots, ba^{n-1}\}$$

of matrices is closed under matrix multiplication ($I$ is the $2 \times 2$ identity matrix).
Hence $G$ is a subgroup of $GL_2(\mathbb{C})$ by Theorem 2 §2.3. For convenience, write
$A = \langle a \rangle$. Then $|A| = n$ and, as $b \notin A$, the left cosets $A$ and $bA$ are disjoint. Hence
$|G| = 2n$. We abstract this situation as follows.
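
The matrix relations can be checked numerically. This is a sketch only, assuming the matrices $a = \mathrm{diag}(\omega, \omega^{-1})$ and $b$ equal to the matrix that swaps the two coordinates, with floating-point comparison up to a small tolerance; the helper names (`matmul`, `close`, `dihedral_matrices`, `count_distinct`) are hypothetical.

```python
import cmath

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def close(X, Y, tol=1e-9):
    """Entrywise comparison up to a floating-point tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

def dihedral_matrices(n):
    """Return a, b and the list I, a, ..., a^(n-1), b, ba, ..., ba^(n-1)."""
    w = cmath.exp(2j * cmath.pi / n)        # primitive n-th root of unity
    a = ((w, 0), (0, 1 / w))
    b = ((0, 1), (1, 0))
    powers = [((1, 0), (0, 1))]
    for _ in range(n - 1):
        powers.append(matmul(powers[-1], a))
    return a, b, powers + [matmul(b, p) for p in powers]

def count_distinct(mats):
    distinct = []
    for m in mats:
        if not any(close(m, d) for d in distinct):
            distinct.append(m)
    return len(distinct)

a, b, G = dihedral_matrices(5)
print(close(matmul(matmul(a, b), a), b))   # aba = b
print(count_distinct(G))                   # 2n matrices for n = 5
```

Because the entries are floating-point complex numbers, equality is tested with a tolerance; an exact check would need symbolic roots of unity.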

If $n \geq 2$, the dihedral group $D_n$ is the group of order $2n$ presented as follows:


$$D_n = \{1, a, a^2, \ldots, a^{n-1}, b, ba, ba^2, \ldots, ba^{n-1}\}$$
where $o(a) = n$, $o(b) = 2$, and $aba = b$.


Note that the requirement that $|D_n| = 2n$ is equivalent to insisting that $b \notin \langle a \rangle$.
We can carry out all calculations in $D_n$ by using the conditions $o(a) = n$, $o(b) = 2$,
and $aba = b$. The equation $aba = b$ implies that $a^k b a^k = b$ for all $k \in \mathbb{Z}$ by induction
(Exercise 25), and so we obtain


$a^k b = ba^{-k} = ba^{n-k}$ and $o(ba^k) = 2$ for all $k \in \mathbb{Z}$.


In particular, $ab = ba^{n-1}$, and these formulas enable us to fill in the Cayley table
for $D_n$. Hence the conditions $o(a) = n$, $o(b) = 2$, and $aba = b$ completely determine
the group $D_n$ (up to isomorphism). The group $D_4$ is called the octic group, and
its Cayley table is as follows (the reader should verify this):


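$$\begin{array}{c|cccccccc}
D_4 & 1 & a & a^2 & a^3 & b & ba & ba^2 & ba^3 \\ \hline
1 & 1 & a & a^2 & a^3 & b & ba & ba^2 & ba^3 \\
a & a & a^2 & a^3 & 1 & ba^3 & b & ba & ba^2 \\
a^2 & a^2 & a^3 & 1 & a & ba^2 & ba^3 & b & ba \\
a^3 & a^3 & 1 & a & a^2 & ba & ba^2 & ba^3 & b \\
b & b & ba & ba^2 & ba^3 & 1 & a & a^2 & a^3 \\
ba & ba & ba^2 & ba^3 & b & a^3 & 1 & a & a^2 \\
ba^2 & ba^2 & ba^3 & b & ba & a^2 & a^3 & 1 & a \\
ba^3 & ba^3 & b & ba & ba^2 & a & a^2 & a^3 & 1
\end{array}$$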
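
The Cayley table can be generated from the defining relations alone. In the sketch below, an element $b^s a^k$ is encoded as the pair `(s, k)`, and the product rule comes from $a^i b = b a^{-i}$; this encoding and the names `dihedral` and `name` are assumptions made for illustration.

```python
def dihedral(n):
    """Elements b^s a^k of D_n as pairs (s, k), with the induced multiplication."""
    elements = [(s, k) for s in (0, 1) for k in range(n)]

    def mul(x, y):
        (s, i), (t, j) = x, y
        if t == 0:                          # (b^s a^i)(a^j) = b^s a^(i+j)
            return (s, (i + j) % n)
        return ((s + 1) % 2, (j - i) % n)   # (b^s a^i)(b a^j) = b^(s+1) a^(j-i)

    return elements, mul

def name(x):
    """Render (s, k) in the form used in the table: 1, a, a^2, ..., b, ba, ba^2, ..."""
    s, k = x
    if (s, k) == (0, 0):
        return "1"
    power = "" if k == 0 else ("a" if k == 1 else f"a^{k}")
    return ("b" if s else "") + power

D4, mul = dihedral(4)
a, b = (0, 1), (1, 0)
table = [[name(mul(x, y)) for y in D4] for x in D4]
for x, row in zip(D4, table):
    print(f"{name(x):>5} | " + " ".join(f"{entry:>5}" for entry in row))
```

Each printed row lists the products of the row label (on the left) with the column elements in the order 1, a, a^2, a^3, b, ba, ba^2, ba^3; calling `dihedral(n)` with another `n` gives the corresponding table for $D_n$.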


The group $D_3$ is isomorphic to $S_3$ because
$D_3 = \{1, a, a^2, b, ba, ba^2\}$, $o(a) = 3$, $o(b) = 2$, and $aba = b$
which is the same as the presentation of $S_3$ given previously. If $n = 2$,
$D_2 = \{1, a, b, ba\}$, $o(a) = 2$, $o(b) = 2$, and $aba = b$.
We have $a^{-1} = a$ here because $o(a) = 2$, so $D_2$ is abelian ($ba = a^{-1}b = ab$) and is
isomorphic to the Klein group $K_4$.

Thus, every group of order 4 is either cyclic or dihedral (Example 18 §2.2). The
next theorem shows that this result holds for groups of order $2p$ where $p$ is a prime.


Theorem 3. Let $G$ be a group of order $2p$ where $p$ is a prime. Then either $G$ is
cyclic or $G \cong D_p$.
Proof. First, the theorem is true if $p = 2$ because $|G| = 4$ implies $G$ is cyclic or
$G \cong K_4 \cong D_2$. Hence we assume that $p$ is odd.

Assume that $G$ is not cyclic. Hence $o(g) = 1$, 2, or $p$ for every $g \in G$ by Corollary
1 of Lagrange’s theorem. We must show that $G \cong D_p$.


Claim 1. $G$ has an element of order $p$.


Proof. If not, $g^2 = 1$ for all $g \in G$, so $G$ is abelian by Exercise 20 §2.2. Hence
if 1, $a$, and $b$ are distinct in $G$, then $\{1, a, b, ab\}$ is a subgroup of order 4 by
Theorem 2 §2.3, contrary to Lagrange’s theorem (4 does not divide $2p$ because $p$ is odd). This proves Claim 1.


So let $a \in G$ have order $p$ and write $H = \langle a \rangle = \{1, a, a^2, \ldots, a^{p-1}\}$.
Claim 2. If $x \in G$ and $x \notin H$, then $o(x) = 2$.


Proof. We have $G = H \cup Hx$ (because $|G : H| = 2$) so, because $x^2 \notin Hx$ (otherwise $x \in H$), we must have $x^2 \in H$. If $o(x) = p$
then, since $p$ is odd, $x = x^{p+1} = (x^2)^{(p+1)/2} \in H$, contrary to the choice of $x$. Thus
$o(x) \neq p$, so $o(x) = 2$ ($x \neq 1$ because $x \notin H$). This proves Claim 2.
Now choose $b \notin H$. Then $G = H \cup bH$, a disjoint union, so we obtain
$$G = \{1, a, a^2, \ldots, a^{p-1}, b, ba, ba^2, \ldots, ba^{p-1}\}.$$


As $o(b) = 2$ by Claim 2, it remains to show that $aba = b$. But $ba \notin H$ so $(ba)^2 = 1$
again by Claim 2. But then $aba = b^{-1}(ba)^2 = b^{-1} = b$. Thus $G \cong D_p$. ∎
Theorem 3 together with Corollary 3 determines all groups $G$ with $|G| \leq 7$:

$$\begin{array}{c|ccccccc}
|G| & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline
G & C_1 = \{1\} & C_2 & C_3 & C_4,\ K_4 & C_5 & C_6,\ D_3 & C_7
\end{array}$$


Note that $K_4 \cong C_2 \times C_2$, so every abelian group here is (isomorphic to) a direct
product of cyclic groups. In fact this is true for every finite abelian group, an
important result discussed in Chapter 7.

Obviously, the list continues. We will show that there are five nonisomorphic
groups of order 8: $C_8$, $C_4 \times C_2$, $C_2 \times C_2 \times C_2$, $D_4$, and another group $Q$ called the
quaternion group, to be introduced in Section 2.8. The groups of order 9 are $C_9$
and $C_3 \times C_3$ (both abelian), and there are two distinct groups of order 10: $C_{10}$ and
$D_5$. The next interesting case is the groups of order 12 (there are five). However,
it is not our intention to imply that all the distinct groups of order $n$ have been
determined for an arbitrary integer $n$. That is a very difficult task!

We conclude with an application of Lagrange’s theorem to number theory. If
$n \geq 2$, the Euler function $\varphi$ is defined by


$\varphi(n)$ is the number of integers $k \in \{1, 2, \ldots, n - 1\}$ with $\gcd(k, n) = 1$.
We define $\varphi(1) = 1$. Hence $\varphi(2) = 1$, $\varphi(3) = 2$, $\varphi(4) = 2$, $\varphi(5) = 4$, and $\varphi(6) = 2$.
Clearly,
$\varphi(p) = p - 1$ whenever $p$ is a prime.


Now recall (Theorem 5 §1.3) that, for $n \geq 2$, the group of (multiplicative) units
in $\mathbb{Z}_n$ is given by $\mathbb{Z}_n^* = \{\bar{k} \mid 1 \leq k < n \text{ and } \gcd(k, n) = 1\}$. Hence
If $n \geq 2$ then $\varphi(n) = |\mathbb{Z}_n^*|$.


With this, Lagrange’s theorem yields an elegant proof of the following famous result 
in number theory. 


Theorem 4. Euler’s Theorem. If $a$ and $n \geq 2$ are relatively prime integers, then
$a^{\varphi(n)} \equiv 1 \pmod{n}$.


Proof. We have $\bar{a} \in \mathbb{Z}_n^*$. Since $|\mathbb{Z}_n^*| = \varphi(n)$, Lagrange’s theorem (Corollary 2) gives
$\bar{a}^{\varphi(n)} = \bar{1}$ in $\mathbb{Z}_n^*$. Euler’s theorem follows. ∎


A special case gives another proof of Fermat’s theorem (Theorem 8 §1.3). 


Corollary. Fermat’s Theorem. If $p$ is a prime, then $a^p \equiv a \pmod{p}$ for all
integers $a$.


Proof. This is clear if $a \equiv 0 \pmod{p}$. Otherwise, $a$ and $p$ are relatively prime so,
because $\varphi(p) = p - 1$, Euler’s theorem gives $a^{p-1} \equiv 1 \pmod{p}$. Fermat’s theorem
follows. ∎
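
Both congruences are easy to test numerically for small moduli. The following sketch assumes the Euler function is computed directly from its definition (counting integers coprime to n), which is adequate only for small n; the names `phi` and `units` are illustrative.

```python
from math import gcd

def phi(n):
    """Euler function computed from the definition; phi(1) = 1 by convention."""
    if n == 1:
        return 1
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)

def units(n):
    """Representatives 1 <= k < n of the units of Z_n."""
    return [k for k in range(1, n) if gcd(k, n) == 1]

print([phi(n) for n in range(1, 7)])   # [1, 1, 2, 2, 4, 2]

# Euler: a^phi(n) = 1 (mod n) whenever gcd(a, n) = 1.
euler_ok = all(pow(a, phi(n), n) == 1 for n in range(2, 60) for a in units(n))

# Fermat: a^p = a (mod p) for every integer a, including multiples of p.
fermat_ok = all(pow(a, p, p) == a % p for p in (2, 3, 5, 7, 11, 13)
                for a in range(-20, 21))
print(euler_ok, fermat_ok)
```

The three-argument form of `pow` performs modular exponentiation, so the check stays fast even when the exponent is large.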


Joseph Louis Lagrange (1736-1813) While his name sounds French, Lagrange was
born in Italy and spent his early years in Turin. In 1766, he was appointed as Euler’s
successor at the Berlin Academy by Frederick the Great, who suggested that the
“greatest mathematician in Europe” should be at the court of the “greatest king in Europe.”
After the death of Frederick, Lagrange went to Paris at the invitation of Louis XVI.
He remained there throughout the revolution and was made a count by Napoleon, who
called him the “lofty pyramid of the mathematical sciences.”


Lagrange was one of the great mathematicians of all time. He made important contribu- 
tions to many parts of mathematics, including number theory, the theory of equations, 
differential equations, celestial mechanics, and fluid dynamics. At age 19 he solved a 
famous problem, the so-called isoperimetrical problem, by inventing an entirely new 
method, known today as the calculus of variations. His work brought a new level of rigor 
to analysis. In addition to his mathematical achievements, he was a master of
exposition, and his Mécanique Analytique is a masterpiece that William Rowan Hamilton
described as a “scientific poem.”


In his work on the theory of polynomial equations, Lagrange studied the permutations 
of the roots of an equation in the hope of finding a general method of solution. He saw 
that, because the symmetric groups $S_2$, $S_3$, and $S_4$ were sufficiently “nice,” a general
solution can always be found if the degree is 2, 3, or 4. But he never discovered what
it was about $S_5$ that obstructed the solution of equations of degree 5. Abel, and later
Galois, eventually clarified the matter. Nevertheless, Lagrange’s work provided one of
the sources from which the modern theory of groups evolved.


Exercises 2.6 
1. In each case find the right and left cosets in G of the subgroups H and K of G. 
(a) G = (a), o(a)= 20; H = (a*), K = (a?) 
(b) $G = A_4$; $H = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$, $K = \langle (1\ 2\ 3) \rangle$

(c) $G = \mathbb{Z}$; $H = 2\mathbb{Z}$, $K = 3\mathbb{Z}$

(d) $G = \mathbb{Z}_{12}$; $H = 3\mathbb{Z}_{12}$, $K = 2\mathbb{Z}_{12}$

(e) $G = D_4 = \{1, a, a^2, a^3, b, ba, ba^2, ba^3\}$, $o(a) = 4$, $o(b) = 2$, and $aba = b$;
$H = \langle a^2 \rangle$, $K = \langle b \rangle$.

(f) $G$ = any group; $H$ is any subgroup of index 2


2. If $G$ is any group, describe the cosets in $G$ of the subgroups $\{1\}$ and $G$.

3. If $H$ is a subgroup of $G$ and $Ha = Hb$ where $a, b \in G$, does it follow that $aH = bH$?
Support your answer.

4. If $K \subseteq H \subseteq G$ are finite groups, show that $|G : K| = |G : H| \cdot |H : K|$.

5. If $H$ is a subgroup of $G$ and $a, b \in G$, define $a \equiv b$ if $b^{-1}a \in H$.

(a) Show that $\equiv$ is an equivalence relation on $G$.
(b) Show that the equivalence class (Section 0.4) of $a \in G$ is the left coset $aH$.


6. Let $G = \mathbb{R} \times \mathbb{R}$ with addition $(x, y) + (x', y') = (x + x', y + y')$. Let $H$ be the line
$y = mx$ through the origin: $H = \{(x, mx) \mid x \in \mathbb{R}\}$. Show that $H$ is a subgroup of
$G$ and describe the cosets $H + (a, b)$ geometrically.

7. Let $H$ be a subgroup of $G$ and suppose that $Ha = bH$ for $a, b \in G$. Show that
$aH = Hb$.

8. Let $H$ and $K$ be subgroups of $G$. If $Ha \subseteq Kb$ for some $a, b \in G$, show that $H \subseteq K$.

9. In each case give a geometric description of the cosets of $H$ in $G$.

(a) $G = \mathbb{R}^*$, $H = \mathbb{R}^+$ (b) $G = \mathbb{C}^*$, $H = \mathbb{R}^*$

(c) $G = \mathbb{R}$, $H = \mathbb{Z}$ (d) $G = \mathbb{C}$, $H = \mathbb{R}$

(a) If G = (a) and o(a) = 30, find the index of (a) in G. 

(b) Let G = (a), o(a) =n. If din, find the index of (a*) in G. 

11. Let $H$ and $K$ be subgroups of some group $G$.

(a) Show that $Ha \cap Ka = (H \cap K)a$ for all $a \in G$.

(b) Given $a, b \in G$, show that either $Ha \cap Kb$ is empty or $Ha \cap Kb = (H \cap K)c$ for
some $c \in G$.

Let G denote a group and let g € G. In each case show G = (g). 

(a) |G| = 12, 9 #1, ® #1. 

(b) |G] = 40, g§ #1, 9° #1. 

(c) |G| = 60, g°° #1, 97° #1, and g* #1. 

(d) Generalize. [Hint: Prime factorization.] 

13. Let $K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$, and let $H$ be a subgroup of $A_4$
containing $K$. If $H$ contains any 3-cycle, show that $H = A_4$.

14. Suppose that $G$ has subgroups of orders 45 and 75. If $|G| < 400$, determine $|G|$.

15. If $H$ and $K$ are subgroups of a group and $|H|$ is prime, show that either $H \subseteq K$ or
$H \cap K = \{1\}$.

16. Let $G$ be a group of order $n$ and let $m$ be an integer with $\gcd(m, n) = 1$.

(a) If $g^m = 1$ in $G$, show that $g = 1$. [Hint: Theorem 4 §1.2.]

(b) Show that each $g \in G$ has an $m$th root, that is, that $g = a^m$ for some $a \in G$.

17. Let $|G| = p^2$, where $p$ is a prime. Show that every proper subgroup of $G$ is cyclic.

18. Let $|G| = p^3$, where $p$ is a prime. If $G$ is not cyclic, show that $g^{p^2} = 1$ for all $g \in G$.

19. Let $a^k = b^k$ in a group. If $o(a) = m$ and $o(b) = n$, where $n$ and $m$ are relatively prime,
show that $mn$ divides $k$. [Hint: Lagrange Corollary 4 and Theorem 5 §1.2.]

20. Show that $|\mathbb{Z}_n^*|$ is even if $n \geq 3$. [Hint: Corollary 1 of Lagrange’s theorem.]

21. Show that $|\mathbb{Z} : n\mathbb{Z}| = n$ for every $n \geq 1$.

22. If $G$ is a group of order $n$, define $\sigma : G \to G$ by $\sigma(g) = g^m$ for all $g \in G$. If
$\gcd(m, n) = 1$, show that $\sigma$ is a bijection (an automorphism if $G$ is abelian).


23. If $G$ is a group of order $p^k$, where $p$ is a prime and $k \geq 1$, show that $G$ must have an
element of order $p$. [Hint: Theorem 5 §2.4.]

24. If $G$ is a group of order $pq$, where $p$ and $q$ are primes, show that every proper subgroup
of $G$ is cyclic.

25. (a) In $D_n$, show that $a^k b a^k = b$ for all $k \in \mathbb{Z}$.

(b) In $D_n$, show that $o(ba^k) = 2$ for all $k \in \mathbb{Z}$.

26. If $n \geq 3$, show that $Z(D_n) = \{1\}$ if $n$ is odd, and that $Z(D_{2m}) = \{1, a^m\}$.

27. Is $D_5 \times C_3 \cong D_3 \times C_5$? Prove your answer.

28. If $k \mid n$, $k \geq 2$, show that $D_n$ has a subgroup isomorphic to $D_k$.

29. Let $G$ be a group and let $p$ be a prime.

(a) If $H$ and $K$ are subgroups of order $p$, show that $H = K$ or $H \cap K = \{1\}$.

(b) If $H_1, H_2, \ldots, H_k$ are distinct subgroups of order $p$, show that

$$|H_1 \cup H_2 \cup \cdots \cup H_k| = 1 + k(p - 1).$$

(c) If $|G| = 15$, show that $G$ must have an element of order 3.

30. Let $G$ be any group (possibly infinite) that has no subgroups except $\{1\}$ and $G$. If
$|G| \geq 2$, show that $G$ is finite and cyclic and that $|G|$ is prime. (Converse of Corollary
3 of Lagrange’s theorem.)

31. Let $K \subseteq H \subseteq G$ be groups. Show that both $|G : H|$ and $|H : K|$ are finite if and only
if $|G : K|$ is finite, and then $|G : K| = |G : H||H : K|$.

[Hint: If $|H : K| = n$, let $Kh_1, Kh_2, \ldots, Kh_n$ be the distinct cosets of $K$ in $H$. Show
that $Hg = Kh_1 g \cup Kh_2 g \cup \cdots \cup Kh_n g$ is a disjoint union for all $g \in G$.]

32. Let $H$ and $K$ be subgroups of a group $G$ with $|G : H| = m$ and $|G : K| = n$.

(a) Show that $|G : H \cap K| \leq mn$. [Hint: $(H \cap K)g = Hg \cap Kg$ for all $g \in G$.]

(b) If $\gcd(m, n) = 1$, show that $|G : H \cap K| = mn$. [Hint: Exercise 31.]

33. Prove Poincaré’s Theorem: If $H_1, H_2, \ldots, H_n$ are subgroups of a group $G$ of finite
index, then $H_1 \cap H_2 \cap \cdots \cap H_n$ is also of finite index. [Hint: Exercise 32.]

34. Show that $A_4$ has no subgroup of order 6, and hence that the converse of Lagrange’s
theorem is false. [Hint: Theorem 3.]

35. If $H$ and $K$ are subgroups of a group $G$, define a relation $\equiv$ on $G$ by $a \equiv b$ if $a = hbk$
for some $h \in H$ and $k \in K$.

(a) Show that $\equiv$ is an equivalence on $G$.

(b) Describe the equivalence classes (called double cosets).

36. If $\varphi$ is the Euler function, show that $n = \sum_{d \mid n} \varphi(d)$, where the sum is taken over all
positive divisors $d$ of $n$. [Hint: Theorems 8 and 9, Section 2.4.]


2.7 GROUPS OF MOTIONS AND SYMMETRIES 


Group theory began with the study of subgroups of the symmetric group $S_n$. In
this short section we discuss some of these groups, which arise from the symmetries 
of geometric figures. By a figure we mean a finite set of points called vertices, 
some pairs of which are joined by line segments. A motion of a geometric figure is 
a permutation of its vertices that can be realized by a rigid motion in space. 


Given two motions $\sigma$ and $\tau$ of a figure, the composite $\sigma\tau$ is also a motion
obtained by first doing $\tau$ and then $\sigma$. Similarly, $\sigma^{-1}$ is a motion achieved by reversing
the motion that led to $\sigma$. Finally, the identity permutation $\varepsilon$ is a motion (resulting
from doing nothing at all). Hence the subgroup test gives Theorem 1.

Theorem 1. The set of motions of a figure with $n$ vertices is a subgroup of $S_n$.
This theorem leads to many interesting groups.


Example 1. Find the group of motions of a (nonsquare) rectangle.


[Figure: a rectangle with its four vertices labeled 1, 2, 3, 4.]

Solution. Label the vertices as shown. Then the motions $(1\ 2)(3\ 4)$ and $(1\ 4)(2\ 3)$
result from rotating the rectangle $\pi$ radians (180°) about the vertical and horizontal
axes of symmetry, respectively. The composite of these is $(1\ 3)(2\ 4)$, which is the
motion obtained by a rotation of 180° in the plane of the rectangle. Hence


$$G = \{\varepsilon, (1\ 2)(3\ 4), (1\ 4)(2\ 3), (1\ 3)(2\ 4)\}$$
is the group of motions. This group is isomorphic to the Klein group. □


Example 2. Find the group of motions of an equilateral triangle.


[Figure: an equilateral triangle with vertices labeled 1, 2, 3.]

Solution. Label the vertices as shown. The motions $\sigma = (1\ 2\ 3)$ and $\sigma^2 = (1\ 3\ 2)$ are
achieved by clockwise rotations of $2\pi/3$ radians (120°) and $4\pi/3$ radians (240°),
respectively. In addition, $\tau = (1\ 2)$ is realized by rotating the triangle $\pi$ radians (180°)
about the line through vertex 3 and the midpoint of the opposite side. Similarly $(1\ 3)$
and $(2\ 3)$ are motions, so the group is


$$S_3 = \{\varepsilon, (1\ 2\ 3), (1\ 3\ 2), (1\ 2), (1\ 3), (2\ 3)\}.$$


This shows that the equilateral triangle is highly symmetric because every possible
permutation of the vertices can be obtained by a rigid motion in space. □


It is vital that the rigid motions allowed in Example 2 are rigid motions in 
space. If the only motions allowed were those in the plane of the triangle, the group 
of motions would be {e,(1 2 3), (1 3 2)}. The permutations (1 2), (1 3), and 
(2 3) cannot be achieved by rigid motions in the plane of the triangle. 

A striking illustration of this phenomenon results when we consider the group
$G$ of motions of a tetrahedron. This figure is three-dimensional, with four vertices
and six edges of equal length as in the diagram. Clearly $(1\ 2\ 3)$ is a motion of the
tetrahedron, obtained by a rotation of $2\pi/3$ radians (120°) about a line through vertex 4
and the center of the opposite face. Similarly, all 3-cycles are in $G$. The three permutations
$(1\ 2)(3\ 4)$, $(1\ 3)(2\ 4)$, and $(1\ 4)(2\ 3)$ are also motions, so $A_4 \subseteq G$, where $A_4$ is the
alternating group of all even permutations:

[Figure: a tetrahedron with vertices labeled 1, 2, 3, 4.]

$$A_4 = \left\{ \begin{array}{l}
\varepsilon,\ (1\ 2)(3\ 4),\ (1\ 3)(2\ 4),\ (1\ 4)(2\ 3), \\
(1\ 2\ 3),\ (1\ 2\ 4),\ (1\ 3\ 4),\ (2\ 3\ 4), \\
(1\ 3\ 2),\ (1\ 4\ 2),\ (1\ 4\ 3),\ (2\ 4\ 3)
\end{array} \right\}$$


We claim that $A_4 = G$. Suppose on the contrary that $\sigma \in G$ is an odd motion.
If $\gamma = (1\ 2)$, write $\tau = \gamma\sigma$. Then $\tau$ is even so $\tau \in G$ and hence $\gamma = \tau\sigma^{-1}$ is in
$G$ because $G$ is a group. But the transposition $\gamma = (1\ 2)$ is not a motion of the
tetrahedron, because interchanging vertices 1 and 2 by a rigid motion necessarily
interchanges 3 and 4. It follows that


Example 3. The group of motions of the tetrahedron is $A_4$.


This situation is analogous to that for the equilateral triangle in Example 2, 
where the group of motions is A3 = {e,(1 2 3),(1 3 2)} if the motions are re- 
stricted to the plane containing the triangle. Any odd permutation is achieved as a 
motion only if the triangle is pulled out of its plane, flipped over, and placed back 
in its plane. Similarly, no odd permutation of the vertices of a tetrahedron can be 
realized by a motion in 3-space. It can be achieved only if the figure is “moved” 
into 4-space in the process. 

Even so, these odd permutations of the vertices of a tetrahedron are symmetries 
of the figure in the intuitive sense of the word. To make this precise, we let d(z, y) 
denote the distance between two points x and y in space. As in Section 1.4, if 
a € Sy, we write o(k) = ok for all integers k. Given a geometric figure with n 
vertices labeled 1,2,--- ,n, a symmetry of the figure is a permutation o of the 
vertices that preserves the distance between any two vertices; that is 


d(ok,om)=d(k,m), — for all k, m=1,2,...,n. 


Clearly, any motion of a figure is a symmetry, but the converse is not true. For
example, the transposition $\gamma = (1\ 2)$ is a symmetry of the tetrahedron, but it is
not a motion, as we have demonstrated.


Theorem 2. The symmetries of a figure with $n$ vertices are a subgroup of $S_n$.


Proof. The identity permutation is clearly a symmetry. If $\sigma$ and $\tau$ are symmetries,
then for vertices $k$ and $m$ we have


$$d[(\sigma\tau)k, (\sigma\tau)m] = d[\sigma(\tau k), \sigma(\tau m)] = d(\tau k, \tau m) = d(k, m).$$


Hence $\sigma\tau$ is a symmetry. Finally, write $\sigma^{-1}k = k_1$ and $\sigma^{-1}m = m_1$. Then $k = \sigma k_1$
and $m = \sigma m_1$, so, since $\sigma$ is a symmetry,


$$d(\sigma^{-1}k, \sigma^{-1}m) = d(k_1, m_1) = d(\sigma k_1, \sigma m_1) = d(k, m).$$

This shows that $\sigma^{-1}$ is a symmetry and so completes the proof. ∎
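
The definition of a symmetry can be checked by brute force for small figures. The following is a minimal sketch, assuming hypothetical coordinates for a regular tetrahedron and a square (the coordinates, the function name `symmetries`, and the 0-indexed vertex labels are illustrative choices, not taken from the text). It lists every permutation of the vertices that preserves all pairwise distances.

```python
from itertools import permutations, combinations
from math import dist, isclose

def symmetries(points):
    """Return all vertex permutations s (0-indexed tuples) with d(s k, s m) = d(k, m)."""
    n = len(points)
    result = []
    for s in permutations(range(n)):
        if all(isclose(dist(points[s[k]], points[s[m]]), dist(points[k], points[m]))
               for k, m in combinations(range(n), 2)):
            result.append(s)
    return result

# Assumed coordinates: alternate corners of a cube form a regular tetrahedron.
tetrahedron = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
# Assumed coordinates: a square with its vertices listed in cyclic order.
square = [(1, 1), (1, -1), (-1, -1), (-1, 1)]

print(len(symmetries(tetrahedron)))  # 24
print(len(symmetries(square)))       # 8
```

Every permutation of the tetrahedron's vertices passes the test because all six edges have the same length, while only 8 of the 24 permutations of the square's vertices do.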


Now let $G$ denote the group of symmetries of the tetrahedron. Then Example
3 gives $A_4 \subseteq G \subseteq S_4$, and $A_4 \neq G$ because $(1\ 2) \in G$. This implies that $G = S_4$
because $|S_4 : A_4| = 2$ is a prime (see Example 6 §2.6). Hence


Example 4. The group of symmetries of the tetrahedron is $S_4$.


The group of motions (in 3-space) of a geometric figure is thus a subgroup of
the group of symmetries, and the two may be distinct as the tetrahedron shows.
However, if the figure can be drawn in a plane, the two groups coincide. The reason
comes from a theorem of plane geometry. Call a mapping $\sigma$ from the plane to itself
an isometry if it preserves distance; that is, if $d[\sigma(x), \sigma(y)] = d(x, y)$ for all $x$ and
$y$. It can be shown that every isometry of the plane is a composite of translations,
rotations about a point, and reflections in a line. Translations and rotations result
from motions in the plane itself, whereas reflections can only be achieved by motions
in 3-space. Thus, every isometry of the plane (and hence every symmetry of a plane
figure) is a motion in 3-space. Of course, this condition breaks down for a
three-dimensional figure because reflections in a plane are isometries of 3-space that are
not motions of 3-space.

We conclude this section by representing the dihedral group $D_n$ as a group of
motions. If $n \geq 3$, a regular $n$-gon is a plane figure with $n$ vertices evenly placed
on a circle. Thus, a regular 3-gon is an equilateral triangle, a regular 4-gon is a
square, and so on. Consider the group $G$ of all motions of a regular $n$-gon. There
are two obvious motions:


(1) $\sigma = (1\ 2\ 3\ \cdots\ n)$: the clockwise rotation of $2\pi/n$ radians ($360°/n$)
about the center of the figure.

(2) $\tau = (1\ \ n-1)(2\ \ n-2)(3\ \ n-3)\cdots$: the rotation of $\pi$ radians (180°)
about a line through the vertex $n$ and the center of the figure.


If $n$ is odd, then $\tau$ fixes only the vertex $n$, whereas if $n = 2m$, then $\tau$ fixes $n$ and
$m$ (see Figure 2.1). If $\lambda$ is any motion of the $n$-gon, $\lambda$ is determined by its effect
$\lambda 1$ and $\lambda 2$ on vertices 1 and 2. If $\lambda 2$ follows $\lambda 1$ (clockwise round the $n$-gon) then
$\lambda = \sigma^k$ for some $k$. On the other hand, if $\lambda 2$ precedes $\lambda 1$, then $\lambda = \tau\sigma^k$ for some $k$.
For example, if $n = 7$ and the effect of $\lambda$ is that shown in Figure 2.2, then $\lambda$ can be
achieved by $\tau\sigma^k$ as shown in Figure 2.3.


FIGURE 2.1 The difference when $n$ is odd or even.


FIGURE 2.2 The effect of the motion $\lambda$.


FIGURE 2.3 The effect of $\tau\sigma^k$ is that of $\lambda$.


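Because $|\sigma| = n$ and $|\tau| = 2$, it follows that


$$G = \{\varepsilon, \sigma, \sigma^2, \ldots, \sigma^{n-1}, \tau, \tau\sigma, \tau\sigma^2, \ldots, \tau\sigma^{n-1}\}.$$


Thus $G = \langle\sigma\rangle \cup \tau\langle\sigma\rangle$, so $|G| = 2n$. Moreover, the relation $\sigma\tau\sigma = \tau$ is valid as the
following diagram shows.


$$\begin{array}{ccccccc}
 & \sigma & & \tau & & \sigma & \\
1 & \to & 2 & \to & n-2 & \to & n-1 \\
2 & \to & 3 & \to & n-3 & \to & n-2 \\
\vdots & & \vdots & & \vdots & & \vdots \\
n-1 & \to & n & \to & n & \to & 1 \\
n & \to & 1 & \to & n-1 & \to & n
\end{array}$$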


Because $|\sigma| = n$, $|\tau| = 2$, and $\sigma\tau\sigma = \tau$, the definition of $D_n$ (in §2.6) proves


Theorem 3. The group of motions of a regular $n$-gon is isomorphic to $D_n$.


If $n = 3$, Theorem 3 shows that the group of motions of an equilateral triangle is
isomorphic to $D_3$, as is clear from Example 2. If $n = 4$, it shows that the group of
motions of the square is isomorphic to the octic group $D_4$.
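
The counting argument behind Theorem 3 can be tested numerically. The sketch below is an illustration under stated assumptions: vertices $1, \ldots, n$ are stored 0-indexed, `compose` and `motions` are hypothetical helper names, and the group is obtained by closing $\{\sigma, \tau\}$ under multiplication. It checks that the closure has $2n$ elements and that $\sigma\tau\sigma = \tau$.

```python
def compose(p, q):
    """Return p∘q (apply q first), matching the convention (στ)k = σ(τk)."""
    return tuple(p[q[k]] for k in range(len(p)))

def motions(n):
    """Generate the group of motions of the regular n-gon from σ and τ.
    Vertex v in 1..n is stored at index v - 1."""
    sigma = tuple((k + 1) % n for k in range(n))      # σ = (1 2 ... n)
    tau = tuple((n - 2 - k) % n for k in range(n))    # τ sends vertex v to n - v and fixes n
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:                                   # close {σ, τ} under multiplication
        x = frontier.pop()
        for g in (sigma, tau):
            y = compose(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return sigma, tau, group

for n in (3, 4, 7):
    sigma, tau, G = motions(n)
    print(n, len(G), compose(sigma, compose(tau, sigma)) == tau)
```

For each tested $n$ the loop reports $|G| = 2n$ and confirms the relation $\sigma\tau\sigma = \tau$.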


Exercises 2.7 


1. Find the group of motions of the diamond shown: all edges, and the horizontal
   diagonal, of length 1.

2. Describe a symmetry of the cube that is not a (three-dimensional) motion.

3. Consider the figure where the base edges are of length 1 and the sloped edges
   are of length 2.
   (a) Find the group of (three-dimensional) motions.
   (b) Find the group of symmetries.

4. Consider the figure where all the edges have length 1 and the base is square.
   (a) Find the group of (three-dimensional) motions.
   (b) Find the group of symmetries.

5. If the double-marked edges of the square shown are painted blue, find the
   subgroup of the symmetries that carry blue edges to blue edges.
   [Figure: square with vertices 1, 2 along the top and 4, 3 along the bottom.]

6. (a) Find the group of (three-dimensional) motions of the figure where the
   triangle edges are of length 1 and the sides are $1 \times 2$ rectangles.
   (b) Find the group of symmetries of the figure.

7. Find the groups of motions and symmetries of the figure where each face is a
   nonsquare rectangle.


2.8 NORMAL SUBGROUPS 


If $H$ is a subgroup of a group $G$, we have seen that $aH = Ha$ may fail to hold for
some $a \in G$ (Example 2 below). A subgroup $H$ of a group $G$ is called a

normal subgroup of $G$ if $gH = Hg$ holds for all $g$ in $G$.

In this case $H$ is said to be normal in $G$, written $H \lhd G$. These subgroups are of
fundamental importance in group theory, and in this section we begin to see why.


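Example 1. If $G$ is any group, $\{1\} \lhd G$ because $g\{1\} = \{g\} = \{1\}g$, and $G \lhd G$
because $gG = G = Gg$ for all $g \in G$.

Example 2. Let $S_3 = \{\varepsilon, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$, where $o(\sigma) = 3$, $o(\tau) = 2$, and $\sigma\tau\sigma = \tau$.
If $H = \{\varepsilon, \sigma, \sigma^2\}$ and $K = \{\varepsilon, \tau\}$, show that $H \lhd S_3$ but that $K$ is not normal in $S_3$.

Solution. Clearly, $aH = H = Ha$ for all $a \in H$. Because $\sigma\tau = \tau\sigma^2$ and $\sigma^2\tau = \tau\sigma$,
we get

$$H\tau = \{\tau, \sigma\tau, \sigma^2\tau\} = \{\tau, \tau\sigma^2, \tau\sigma\} = \tau H.$$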


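Similarly, $H\tau\sigma = \tau\sigma H$ and $H\tau\sigma^2 = \tau\sigma^2 H$, so $H \lhd S_3$.
However, $\sigma K = \{\sigma, \sigma\tau\}$ and $K\sigma = \{\sigma, \tau\sigma\}$, so $\sigma K \neq K\sigma$. Hence $K$ is not normal in $S_3$. □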
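
The coset comparison in Example 2 is easy to automate. The following sketch assumes a particular concrete model of $S_3$ (permutations of $\{0, 1, 2\}$ with $\sigma = (1\ 2\ 3)$ and $\tau = (1\ 2)$ written 0-indexed); the helper names are illustrative. It tests the defining condition $gH = Hg$ for every $g$.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first)."""
    return tuple(p[q[k]] for k in range(len(p)))

def left_coset(g, H):
    return frozenset(compose(g, h) for h in H)

def right_coset(H, g):
    return frozenset(compose(h, g) for h in H)

def is_normal(H, G):
    """H is normal in G when gH = Hg for every g in G."""
    return all(left_coset(g, H) == right_coset(H, g) for g in G)

S3 = set(permutations(range(3)))
e = (0, 1, 2)
sigma = (1, 2, 0)   # assumed model of σ, an element of order 3
tau = (1, 0, 2)     # assumed model of τ, an element of order 2

H = {e, sigma, compose(sigma, sigma)}
K = {e, tau}
print(is_normal(H, S3))  # True
print(is_normal(K, S3))  # False
```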


Let $H$ be a subgroup of a group $G$. If $g \in G$ satisfies $gh = hg$ for all $h \in H$, then
obviously $gH = Hg$. In particular, this condition holds if each element $h$ of $H$ is in
the center $Z(G)$ of $G$. This proves Theorems 1 and 2.


Theorem 1. If $G$ is a group, every subgroup of the center $Z(G)$ is normal in $G$. In
particular, $Z(G) \lhd G$.


Theorem 2. If $G$ is an abelian group, every subgroup of $G$ is normal in $G$.


Note that, given $g \in G$, it is not necessary that $gh = hg$ for all $h \in H$ to ensure
that $gH = Hg$. For example, to show that $gH \subseteq Hg$ it is only necessary to show
that, given $h \in H$, $gh = h'g$ for some $h' \in H$.

The converse of Theorem 1 is false: The subgroup $H$ in Example 2 is normal
in $S_3$, but $H$ is certainly not central in $S_3$ (in fact, $Z(S_3) = \{\varepsilon\}$). The converse
of Theorem 2 is also false: Example 9 below exhibits a nonabelian group in which
every subgroup is normal.


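Example 3. If $K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$, show that $K \lhd A_4$.


Solution. If $\sigma \in S_n$ and $\gamma = (k_1\ k_2\ \cdots\ k_r)$ is a cycle of length $r$, then $\sigma\gamma\sigma^{-1}$ is
also a cycle of length $r$; in fact $\sigma\gamma\sigma^{-1} = (\sigma k_1\ \sigma k_2\ \cdots\ \sigma k_r)$ (see Lemma 3 below).
With this, let $(a\ b)(c\ d) \in K$. If $\sigma \in S_4$, then

$$\sigma[(a\ b)(c\ d)]\sigma^{-1} = \sigma(a\ b)\sigma^{-1}\,\sigma(c\ d)\sigma^{-1} = (\sigma a\ \sigma b)(\sigma c\ \sigma d) \in K.$$

It follows that $K \lhd S_4$, so certainly $K \lhd A_4$. □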


Theorem 3. Normality Test. The following conditions are equivalent for a
subgroup $H$ of a group $G$.

(1) $H$ is normal in $G$.

(2) $gHg^{-1} \subseteq H$ for all $g \in G$.

(3) $gHg^{-1} = H$ for all $g \in G$.


Proof. (1) $\Rightarrow$ (2). Let $x \in gHg^{-1}$, say $x = ghg^{-1}$. Then $gh \in gH = Hg$ by (1), say
$gh = h_1 g$. Then $x = ghg^{-1} = h_1 gg^{-1} = h_1 \in H$. This proves (2).
(2) $\Rightarrow$ (3). If $g \in G$ then $gHg^{-1} \subseteq H$ by (2). Taking $g^{-1}$ in place of $g$ in (2), we
obtain $g^{-1}Hg \subseteq H$. This implies $H \subseteq gHg^{-1}$ (verify), so $H = gHg^{-1}$, proving (3).
(3) $\Rightarrow$ (1). Given $g \in G$, we have $gHg^{-1} = H$ by (3). If $x \in gH$, this shows that
$xg^{-1} \in H$, so $x \in Hg$. This proves $gH \subseteq Hg$. Since $g^{-1}Hg = H$ by (3) (with $g^{-1}$
replacing $g$), a similar argument shows that $Hg \subseteq gH$. Now (1) follows. ∎


Conditions (2) and (3) in Theorem 3 become even more useful if $G$ has a known
set $X$ of generators (see Theorem 10 §2.4). The proof is Exercise 13.


Corollary 1. If $G = \langle X \rangle$, a subgroup $H$ is normal in $G$ if and only if $xHx^{-1} \subseteq H$
for all $x \in X$. In particular, $\langle a \rangle \lhd G$ if and only if $gag^{-1} \in \langle a \rangle$ for all $g \in G$.


If $H$ is a subgroup of $G$ and $g \in G$, recall (Theorem 5 §2.3) that $gHg^{-1}$ is also a
subgroup of $G$ which is isomorphic to $H$,³¹ and is called a conjugate of $H$ in $G$. For
this reason, normal subgroups of $G$ are sometimes called self-conjugate subgroups.
Incidentally, this discussion proves


³¹ In fact $gHg^{-1} = \sigma_g(H)$ where $\sigma_g : G \to G$ is the inner automorphism determined by $g$.


Corollary 2. If $H$ is a subgroup of $G$, and if $G$ has no other subgroups isomorphic
to $H$, then $H$ is normal in $G$.


In particular, if $H$ is finite and $H$ is the only subgroup of its order, then $H \lhd G$
because $|gHg^{-1}| = |H|$ for all $g \in G$.


Theorem 3 suggests a stronger condition than normality. A subgroup $H$ of a
group $G$ is called a characteristic subgroup of $G$ if $\sigma(H) = H$ for all
automorphisms $\sigma : G \to G$ (equivalently if $\sigma(H) \subseteq H$ for all automorphisms $\sigma$). The center
$Z(G)$ is characteristic in $G$, and other examples are given in Exercise 24. If $\sigma = \sigma_a$
is the inner automorphism induced by $a \in G$ then $aHa^{-1} = \sigma_a(H)$, and it follows
that characteristic subgroups are necessarily normal. However, the converse is false
by Exercise 24(c). The following result is often useful.


Corollary 3. If $K \lhd G$ and $H \subseteq K$ is characteristic in $K$, then necessarily $H \lhd G$.


Proof. If $a \in G$, the restriction $\sigma_a : K \to K$ is an automorphism of $K$ because $K \lhd G$.
It follows that $aHa^{-1} = \sigma_a(H) = H$ because $H$ is characteristic in $K$. Hence $H \lhd G$. ∎


Corollary 3 fails if $H$ is merely normal in $K$ (Exercise 4). Many important subgroups
are characteristic subgroups (for example, the center); some of their properties are
given in Exercise 24.


Example 4. Let $G = GL_2(\mathbb{R})$ and let $H$ be the subgroup of all matrices with
determinant 1. Show that $H \lhd G$.


Solution. If $A \in G$ and $B \in H$, the properties of determinants give

$$\det(ABA^{-1}) = \det A \,\det B \,\det A^{-1} = \det A \cdot 1 \cdot \frac{1}{\det A} = 1.$$

This shows that $ABA^{-1} \in H$, so $H$ is normal in $G$ by part (2) of Theorem 3. □


Theorem 4. If $H$ is a subgroup of index 2 in $G$, then $H$ is normal in $G$.


Proof. Let $a \in G$. If $a \in H$, then $Ha = H = aH$. If $a \notin H$ then (because $H$ has
exactly 2 right cosets) $G = H \cup Ha$, a disjoint union. Hence $Ha = G \smallsetminus H$. Similarly
$aH = G \smallsetminus H$ as $H$ has two left cosets, so $Ha = G \smallsetminus H = aH$. Thus, $H \lhd G$. ∎


Note that subgroups of index 3 need not be normal (Example 2).

Example 5. Show $A_n \lhd S_n$ where $A_n$ is the alternating group.

Solution. The alternating group $A_n$ is of index two in $S_n$ (Theorem 8 §1.4). □


Example 6. Let $D_n = \{1, a, a^2, \ldots, a^{n-1}, b, ba, ba^2, \ldots, ba^{n-1}\}$ denote the dihedral
group, where $o(a) = n$, $o(b) = 2$, and $aba = b$. Then $\langle a \rangle \lhd D_n$ by Theorem 4 because
$\langle a \rangle = \{1, a, \ldots, a^{n-1}\}$ has index 2 in $D_n$. □


We defined $H \lhd G$ to mean $gH = Hg$ for all $g \in G$, which is a kind of
commutativity condition on $H$. The next result gives a situation where actual commuting
of elements is implied. It will be referred to later.


Lemma 1. Let $H \lhd G$ and $K \lhd G$. If $H \cap K = \{1\}$, then $hk = kh$ for all elements
$h \in H$ and $k \in K$.


Proof. Consider $x = hkh^{-1}k^{-1}$. Thinking of $x = h(kh^{-1}k^{-1})$ we see that $x \in H$
because $kh^{-1}k^{-1} \in kHk^{-1} = H$ since $H \lhd G$. Similarly, writing $x = (hkh^{-1})k^{-1}$
shows that $x \in K$ because $K \lhd G$. Hence $x \in H \cap K = \{1\}$ by hypothesis, so $x = 1$.
But then $hkh^{-1}k^{-1} = 1$, which gives $hk = kh$. ∎


Let $H$ and $K$ be subgroups of a group $G$. The intersection $H \cap K$ is the largest
subgroup of $G$ contained in both $H$ and $K$ (it contains any such subgroup), and
one wonders if there is a smallest subgroup of $G$ containing both $H$ and $K$. Note
that, while $H \cup K$ is the smallest subset containing $H$ and $K$, it is a subgroup only
if $H \subseteq K$ or $K \subseteq H$ (see Exercise 17 §2.3). A much more useful construction turns
out to be the product $HK$ of the subgroups defined as follows:


$$HK = \{hk \mid h \in H,\ k \in K\}.$$


Then $HK$ contains both $H$ and $K$, and is contained in any such subgroup, but $HK$
need not be a subgroup (consider $H = \{\varepsilon, \tau\}$ and $K = \{\varepsilon, \tau\sigma\}$ in $S_3$ with the usual
notation). However we do have the following result.


Lemma 2. The following are equivalent for subgroups $H$ and $K$ of a group $G$:
(1) $HK$ is a subgroup of $G$.
(2) $HK = KH$.
(3) $KH$ is a subgroup of $G$.


Proof. We prove only (1) $\Leftrightarrow$ (2); then (1) $\Leftrightarrow$ (3) follows by interchanging $H$ and $K$.

(1) $\Rightarrow$ (2). If $kh \in KH$, then $kh = (h^{-1}k^{-1})^{-1} \in HK$ by (1). This shows that
$KH \subseteq HK$. On the other hand, if $hk \in HK$ then $k^{-1}h^{-1} = (hk)^{-1} \in HK$ by (1),
say $k^{-1}h^{-1} = h_1k_1$. Hence $hk = k_1^{-1}h_1^{-1} \in KH$, so $HK \subseteq KH$.

(2) $\Rightarrow$ (1). We use the subgroup test. Clearly $1 = 1 \cdot 1 \in HK$ always holds.
If $hk \in HK$ then $(hk)^{-1} = k^{-1}h^{-1} \in KH = HK$ by (2). Finally, given $hk$ and
$h_1k_1$ in $HK$, we have $kh_1 \in KH = HK$, say $kh_1 = h_2k_2$. But then it follows that
$(hk)(h_1k_1) = h(h_2k_2)k_1 = (hh_2)(k_2k_1) \in HK$, which completes the proof of (1). ∎
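
Lemma 2 can be illustrated with the subgroups of $S_3$ mentioned before it. The sketch below assumes a concrete model of $S_3$ as permutations of $\{0, 1, 2\}$ and uses hypothetical helper names. It forms $HK$ and $KH$ for $H = \{\varepsilon, \tau\}$, $K = \{\varepsilon, \tau\sigma\}$ and checks that neither the equality $HK = KH$ nor closure under multiplication holds.

```python
def compose(p, q):
    """Return p∘q (apply q first)."""
    return tuple(p[q[k]] for k in range(len(p)))

def product(H, K):
    """The set HK = {hk : h in H, k in K}."""
    return {compose(h, k) for h in H for k in K}

def is_subgroup(S):
    """A nonempty finite subset of a group is a subgroup when it is closed under products."""
    return all(compose(x, y) in S for x in S for y in S)

e, sigma, tau = (0, 1, 2), (1, 2, 0), (1, 0, 2)   # assumed model of S_3
H = {e, tau}
K = {e, compose(tau, sigma)}

HK, KH = product(H, K), product(K, H)
print(len(HK), HK == KH, is_subgroup(HK))   # 4 False False

# Replacing H by the normal subgroup <σ> gives HK = KH = S_3.
N = {e, sigma, compose(sigma, sigma)}
print(product(N, K) == product(K, N), is_subgroup(product(N, K)))   # True True
```

The first product has 4 elements, which already rules out a subgroup of $S_3$ by Lagrange's theorem.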


A note of caution is needed here: To say that $HK = KH$ does not mean that
$hk = kh$ for all $h \in H$ and $k \in K$. To show that $HK \subseteq KH$ means that, if $h \in H$
and $k \in K$ are given, then $hk = k_1h_1$ for some $h_1 \in H$ and $k_1 \in K$.

If $G$ is abelian then $HK$ is certainly a subgroup by Lemma 2. There is more:


Theorem 5. Let $H$ and $K$ be subgroups of a group $G$.
(1) If $H$ or $K$ is normal in $G$, then $HK = KH$ is a subgroup of $G$.
(2) If both $H$ and $K$ are normal in $G$, then $HK$ is also normal in $G$.


Proof. (1) Suppose that $K$ is normal in $G$. If $hk \in HK$ then $hk = (hkh^{-1})h \in KH$
because $hkh^{-1} \in hKh^{-1} = K$. Hence $HK \subseteq KH$. The other inclusion is proved the
same way, so Lemma 2 applies. A similar argument works if $H \lhd G$.

(2) If $g \in G$ and $hk \in HK$, then $g^{-1}(hk)g = (g^{-1}hg)(g^{-1}kg) \in HK$ because
$H \lhd G$ and $K \lhd G$. This proves (2). ∎


Many groups arise as direct products of groups of smaller order, and the following
useful theorem gives an important way to recognize when this is the case.


Theorem 6. If $H \lhd G$ and $K \lhd G$ satisfy $H \cap K = \{1\}$, then $HK \cong H \times K$.


Proof. First, $HK$ is a subgroup of $G$ by Theorem 5. Define

$$\sigma : H \times K \to HK \quad \text{by} \quad \sigma(h, k) = hk \quad \text{for all } h \in H \text{ and } k \in K.$$


We show that $\sigma$ is an isomorphism. Given $(h, k)$ and $(h_1, k_1)$ in $H \times K$, we have
$h_1k = kh_1$ by Lemma 1, so $\sigma$ is a homomorphism because


$$\sigma[(h, k) \cdot (h_1, k_1)] = \sigma(hh_1, kk_1) = hh_1kk_1 = hk\,h_1k_1 = \sigma(h, k) \cdot \sigma(h_1, k_1).$$


Since $\sigma$ is clearly onto, it remains to show that $\sigma$ is one-to-one. If $\sigma(h, k) = \sigma(h_1, k_1)$,
then $hk = h_1k_1$, so $h_1^{-1}h = k_1k^{-1} \in H \cap K = \{1\}$. Thus $h_1^{-1}h = 1 = k_1k^{-1}$, so
$h = h_1$ and $k = k_1$. But then $(h, k) = (h_1, k_1)$, proving that $\sigma$ is one-to-one. ∎


The proof that the map $\sigma$ in Theorem 6 is a bijection uses only that $H \cap K = \{1\}$,
so it is valid for any subgroups $H$ and $K$ with trivial intersection, normal or not. Hence


Corollary 1. If $G$ is a finite group, and $H$ and $K$ are subgroups with $H \cap K = \{1\}$,
then $|HK| = |H||K|$.


Corollary 2. Let $G$ be a finite group and let $H$ and $K$ be normal subgroups such
that $H \cap K = \{1\}$ and $|H||K| = |G|$. Then $G \cong H \times K$.


Proof. By Corollary 1, we have $|HK| = |H||K| = |G|$. Hence $G = HK$ because
$HK \subseteq G$ and $G$ is finite, so Theorem 6 applies. ∎


Examples 7 and 8 below illustrate how to use Theorem 6. It is easy to verify
that the direct product of two cyclic groups of relatively prime orders is again cyclic
(Exercise 25 §2.4). Example 7 is the converse. Recall that $C_n$ denotes the generic
cyclic group of order $n$.


Example 7. Let $m$ and $n$ be relatively prime positive integers. If $G$ is a cyclic
group of order $mn$, show that $G \cong C_m \times C_n$.


Solution. Let $G = \langle a \rangle$ where $o(a) = mn$, and write $H = \langle a^n \rangle$ and $K = \langle a^m \rangle$.
Then $|H| = o(a^n) = m$ and $|K| = o(a^m) = n$, so $H \cong C_m$ and $K \cong C_n$. Moreover,
$H \cap K = \{1\}$ by Lagrange's theorem (Corollary 4). Also, $H \lhd G$ and $K \lhd G$
because $G$ is abelian, and $|H| \cdot |K| = mn = |G|$. Hence $G \cong H \times K$ by Corollary 2
of Theorem 6, that is $G \cong C_m \times C_n$ by Example 13 §2.5. □
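
The subgroups used in Example 7 can be written down explicitly in an additive model. This sketch assumes $G = \mathbb{Z}_{mn}$ with generator $a = 1$, so that $a^n$ corresponds to the residue $n$; the function name `decompose` is hypothetical. It checks that $H \cap K$ is trivial and that every element is a sum $h + k$ in exactly one way, and it shows how the intersection condition fails when $m$ and $n$ are not relatively prime.

```python
from math import gcd

def decompose(m, n):
    """In the additive cyclic group Z_{mn} (generator a = 1), return
    H = <n·a> of order m and K = <m·a> of order n."""
    N = m * n
    H = {(n * t) % N for t in range(m)}
    K = {(m * t) % N for t in range(n)}
    return H, K

m, n = 3, 4
H, K = decompose(m, n)
sums = {(h + k) % (m * n) for h in H for k in K}
print(gcd(m, n), sorted(H), sorted(K))     # 1 [0, 4, 8] [0, 3, 6, 9]
print(H & K == {0}, len(sums) == m * n)    # True True

# With m = n = 2 the two subgroups coincide, so H ∩ K is not trivial.
H2, K2 = decompose(2, 2)
print(sorted(H2 & K2))                     # [0, 2]
```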


The fundamental theorem of finite abelian groups asserts that every finite 
abelian group is isomorphic to a uniquely determined direct product of cyclic 
groups. We prove this assertion in Section 7.2. Example 8 gives a special case. 


Example 8. Let $G$ be an abelian group and assume that $|G| = p^2$, $p$ a prime. Then
either $G \cong C_{p^2}$ is cyclic or $G \cong C_p \times C_p$.

Solution. Assume that $G$ is not cyclic. Then $o(g) = p$ if $1 \neq g \in G$ because $o(g)$
divides $p^2$. Choose $a \in G$ with $o(a) = p$, and write $H = \langle a \rangle \cong C_p$. Then $H \neq G$,
so choose $b \notin H$, $b \in G$, and write $K = \langle b \rangle \cong C_p$. Hence $|K| = p$ too, so we have
$|H||K| = p \cdot p = |G|$. Moreover $H \lhd G$ and $K \lhd G$ because $G$ is abelian. Finally,
$H \cap K = \{1\}$ because, otherwise, $H = H \cap K = K$ by Corollary 3 of Lagrange's
theorem. Thus $G \cong H \times K$ by Corollary 2 of Theorem 6, that is $G \cong C_p \times C_p$. □


We have already noted (Theorem 2) that every subgroup of an abelian group
is normal. The converse is not true: A nonabelian group of order 8 exists in which
every subgroup is normal. It is constructed as follows: Let


$$Q = \{\pm 1, \pm i, \pm j, \pm k\}$$


be a set of eight elements with multiplication determined by the following equations:


$$i^2 = j^2 = k^2 = ijk = -1,$$

$$ij = k = -ji, \qquad jk = i = -kj, \qquad ki = j = -ik.$$

[Diagram: $i$, $j$, $k$ placed clockwise around a circle.]


Here 1 and $-1$ multiply as usual, and the multiplication of $i$, $j$, and $k$ is best
remembered by the diagram above: The product of any two of $i$, $j$, and $k$ taken
clockwise around the circle is the next one, whereas the product counterclockwise
is the negative of the next one. One realization of $Q$ is in $GL_2(\mathbb{C})$ where, if $\omega \in \mathbb{C}$
satisfies $\omega^2 = -1$, we take


$$1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
i = \begin{bmatrix} \omega & 0 \\ 0 & -\omega \end{bmatrix}, \quad
j = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \quad \text{and} \quad
k = \begin{bmatrix} 0 & \omega \\ \omega & 0 \end{bmatrix}.$$


The group $Q$ in the preceding discussion is called the quaternion group, and
it rules out the converse to Theorem 2.


Example 9. Show that $Q$ is a nonabelian group in which every subgroup is normal.


Solution. $Q$ is nonabelian because $ij = k$ while $ji = -k$. If a subgroup $H$ contains
one of $\pm i$, $\pm j$, or $\pm k$, then $|H| = 4$ or 8 (because these elements have order 4), so
$H \lhd Q$ by Theorem 4. Otherwise, $H \subseteq \{1, -1\} \subseteq Z(Q)$ and again $H \lhd Q$. □
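
The matrix realization of $Q$ makes Example 9 checkable by direct computation. The following is a sketch under stated assumptions: $\omega$ is taken to be Python's `1j`, matrices are stored row-major as 4-tuples, and the helper names are illustrative. It lists all subgroups of $Q$ (each is generated by at most two elements) and tests normality by conjugation.

```python
from itertools import combinations

W = 1j  # assumed choice of ω in C with ω*ω == -1

def mul(A, B):
    """Multiply 2x2 matrices stored row-major as (a, b, c, d)."""
    a, b, c, d = A
    e, f, g, h = B
    return (a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)

def neg(A):
    return tuple(-x for x in A)

ONE = (1, 0, 0, 1)
I = (W, 0, 0, -W)
J = (0, 1, -1, 0)
K = (0, W, W, 0)
Q = [ONE, neg(ONE), I, neg(I), J, neg(J), K, neg(K)]

def inverse(x):
    return next(y for y in Q if mul(x, y) == ONE)

def generated(gens):
    """Subgroup generated by gens: in a finite group, closing under products suffices."""
    H, frontier = {ONE}, [ONE]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = mul(x, g)
            if y not in H:
                H.add(y)
                frontier.append(y)
    return frozenset(H)

def is_normal(H):
    return all(mul(mul(g, h), inverse(g)) in H for g in Q for h in H)

subgroups = {generated(c) for r in (1, 2) for c in combinations(Q, r)}
print(sorted(len(H) for H in subgroups))     # [1, 2, 4, 4, 4, 8]
print(mul(I, J) == K, mul(J, I) == neg(K))   # True True
print(all(is_normal(H) for H in subgroups))  # True
```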


Simple Groups 


Lagrange's theorem shows that the cyclic groups $G$ of prime order have no
subgroups except $\{1\}$ and $G$. More generally, if $G$ is a group then we say that


$G$ is simple if $G \neq \{1\}$ and the only normal subgroups of $G$ are $\{1\}$ and $G$.


Theorem 7. An abelian group $G \neq \{1\}$ is simple if and only if it is cyclic of prime
order.


Proof. Suppose $G$ is simple and abelian. Then the only subgroups of $G$ are $\{1\}$
and $G$ (all subgroups are normal). If $a \in G$, $a \neq 1$, this means that $\langle a \rangle = G$. Then
$o(a) \neq \infty$ because, otherwise, $\langle a^2 \rangle$ does not equal $\{1\}$ or $G$. So $o(a)$ is finite, say
$o(a) = n \geq 2$. If $p \mid n$ for some prime $p$, then $\langle a^{n/p} \rangle$ is a subgroup of order $p$ by
Theorem 5 §2.4. Hence $G = \langle a^{n/p} \rangle$ is cyclic of prime order.

The converse is by Lagrange's theorem. ∎


Nonabelian finite simple groups are more difficult to find. We conclude with a
proof that, although $A_4$ is not simple by Example 3, the alternating groups $A_n$ are
all simple if $n \geq 5$. This has applications in the theory of equations (Chapter 10).
The proof requires three preliminary results, the first two of independent interest.


Lemma 3. If $\sigma \in S_n$ and $\gamma = (k_1\ k_2\ \cdots\ k_r)$ is a cycle of length $r$, then $\sigma\gamma\sigma^{-1}$ is
also a cycle of length $r$. In fact, $\sigma\gamma\sigma^{-1} = (\sigma k_1\ \sigma k_2\ \cdots\ \sigma k_r)$.


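Proof. Write $\delta = (\sigma k_1\ \sigma k_2\ \cdots\ \sigma k_r)$. Because $\sigma$ is one-to-one, $\delta$ is a cycle of length
$r$. We must show that $\sigma\gamma\sigma^{-1} = \delta$, that is $\sigma\gamma = \delta\sigma$, that is $\sigma(\gamma k) = \delta(\sigma k)$ for each
$k = 1, 2, \ldots, n$. The reader can verify that (writing $k_{r+1} = k_1$)

$$\sigma(\gamma k_i) = \sigma k_{i+1} = \delta(\sigma k_i) \quad \text{for each } i = 1, 2, \ldots, r.$$


But $\sigma(\gamma k) = \sigma k = \delta(\sigma k)$ whenever $k \notin \{k_1, k_2, \ldots, k_r\}$ because (as $\sigma$ is one-to-one)
this implies that $\sigma k \notin \{\sigma k_1, \sigma k_2, \ldots, \sigma k_r\}$. This is what we wanted. ∎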
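
Lemma 3 lends itself to an exhaustive check in a small symmetric group. The sketch below assumes 0-indexed permutations of $\{0, \ldots, 4\}$ stored as image tuples, one arbitrarily chosen 3-cycle $\gamma$, and hypothetical helper names. It compares $\sigma\gamma\sigma^{-1}$ with the cycle obtained by relabeling the entries of $\gamma$ through $\sigma$, for every $\sigma$ in $S_5$.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first)."""
    return tuple(p[q[k]] for k in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for k, pk in enumerate(p):
        inv[pk] = k
    return tuple(inv)

def cycle(n, points):
    """The permutation of {0, ..., n-1} given by the cycle (points[0] points[1] ...)."""
    p = list(range(n))
    for i, k in enumerate(points):
        p[k] = points[(i + 1) % len(points)]
    return tuple(p)

n = 5
gamma_points = (0, 3, 1)          # an arbitrary 3-cycle chosen for illustration
gamma = cycle(n, gamma_points)

lemma_holds = all(
    compose(compose(s, gamma), inverse(s)) == cycle(n, [s[k] for k in gamma_points])
    for s in permutations(range(n))
)
print(lemma_holds)  # True
```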


Lemma 4. If $n \geq 3$, $A_n$ is generated by the 3-cycles.


Proof. As each 3-cycle is in $A_n$, it suffices to show that each permutation $\sigma \in A_n$
is either $\varepsilon$ or a product of 3-cycles. But $\sigma$ is even and so is a product of pairs of
transpositions. Hence the following formulas complete the proof:


$$(i\ j)(i\ j) = \varepsilon, \qquad (i\ j)(i\ k) = (i\ k\ j), \qquad \text{and} \qquad (i\ j)(k\ l) = (i\ k\ j)(i\ k\ l). \qquad ∎$$
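
Lemma 4 can be confirmed for small $n$ by generating the subgroup spanned by all 3-cycles. This is a minimal sketch with assumed conventions (0-indexed image tuples, parity computed by counting inversions, hypothetical helper names); it is a numerical check for $n = 3, 4, 5$, not a replacement for the proof.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first)."""
    return tuple(p[q[k]] for k in range(len(p)))

def is_even(p):
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def three_cycles(n):
    """All 3-cycles (i j k) in S_n as 0-indexed image tuples."""
    cycles = set()
    for i, j, k in permutations(range(n), 3):
        p = list(range(n))
        p[i], p[j], p[k] = j, k, i
        cycles.add(tuple(p))
    return cycles

def generated(gens, n):
    """Subgroup generated by gens: in a finite group, closing under products suffices."""
    identity = tuple(range(n))
    H, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in H:
                H.add(y)
                frontier.append(y)
    return H

for n in (3, 4, 5):
    A_n = {p for p in permutations(range(n)) if is_even(p)}
    print(n, len(three_cycles(n)), generated(three_cycles(n), n) == A_n)
```

The three printed rows report 2, 8, and 20 distinct 3-cycles, and in each case the generated subgroup equals $A_n$.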
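Lemma 5. Suppose that $n \geq 5$. If $H \lhd A_n$ and $H$ contains a 3-cycle, then $H = A_n$.


Proof. If $(i\ j\ k) \in H$ we claim that $(a\ j\ k)$, $(i\ a\ k)$ and $(i\ j\ a)$ are all in $H$
for any $a \notin \{i, j, k\}$. Indeed: Since $n \geq 5$, choose $b \notin \{a, i, j, k\}$. Then we have
$(a\ j\ k) = (i\ a\ b)(i\ j\ k)(i\ a\ b)^{-1} \in H$ by Lemma 3. Similarly, $(i\ a\ k), (i\ j\ a) \in H$.
Let $\sigma = (1\ 2\ 3) \in H$; by Lemma 4 we must show that every 3-cycle $\tau = (i\ j\ k)$ is
in $H$. Write $S = \{1, 2, 3\}$ and $T = \{i, j, k\}$. If $|S \cap T| = 3$, then $S = T$ so $\tau = \sigma \in H$
or $\tau = \sigma^{-1} \in H$. If $|S \cap T| = 2$, say $i = 1$, $j = 2$ and $k \neq 3$, then
$(1\ 2\ 3) \in H \Rightarrow \tau = (1\ 2\ k) \in H$ by the first paragraph. If $|S \cap T| = 1$, say $i = 1$ and
$\{2, 3\} \cap \{j, k\} = \emptyset$, then $(1\ 2\ 3) \in H \Rightarrow (1\ j\ 3) \in H \Rightarrow \tau = (1\ j\ k) \in H$. Finally if
$|S \cap T| = 0$ then $(1\ 2\ 3) \in H \Rightarrow (i\ 2\ 3) \in H \Rightarrow (i\ j\ 3) \in H \Rightarrow \tau = (i\ j\ k) \in H$. ∎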


Theorem 8. If $n \geq 5$, the alternating group $A_n$ is simple.


Proof. Let $H \lhd A_n$, $H \neq \{\varepsilon\}$. Among all elements of $H$ (excluding $\varepsilon$) let $\tau$ be one
that moves the smallest number $m$ of integers. Then $m \geq 3$, because $\tau \in A_n$ is not
a transposition. If $m = 3$ then $\tau$ is a 3-cycle, and we are done by Lemma 5. So
assume $m \geq 4$; we show that this leads to a contradiction. Factor $\tau$ into disjoint
cycles and consider two cases.


Case 1: $\tau$ contains a cycle of length $\geq 3$, say $\tau = (1\ 2\ 3\ \cdots)\gamma_2 \cdots \gamma_r$. If $\tau$ moves
exactly 4 integers, then $\tau = (1\ 2\ 3\ k)$ is odd. So assume that $\tau$ moves (say)
4 and 5, as well as 1, 2, and 3. Let $\beta = (3\ 4\ 5)$ and write $\tau_1 = \tau^{-1}\beta\tau\beta^{-1}$.
Then $\tau_1 \in H$ because $H \lhd A_n$, and $\tau_1 \neq \varepsilon$ because $\tau_1 2 = \tau^{-1}4 \neq 2$. Moreover,
if $k > 5$ is fixed by $\tau$, then $k$ is also fixed by $\tau_1$ (because $\beta k = k$). Hence if $\tau_1$
moves $k > 5$, then $\tau$ also moves $k$. But $\tau_1$ fixes 1, whereas $\tau$ does not. Thus $\tau_1$
moves fewer elements than $\tau$, a contradiction.

Case 2: $\tau$ is a product of disjoint transpositions, say, $\tau = (1\ 2)(3\ 4)\cdots$. As
before, let $\beta = (3\ 4\ 5)$ and $\tau_1 = \tau^{-1}\beta\tau\beta^{-1}$. Now $\tau_1$ fixes 1 and 2 and any
integer $k > 5$ that is fixed by $\tau$. Because $\tau_1 \neq \varepsilon$ (for example, $\tau_1 5 = 3$), this is a
contradiction, as in Case 1. ∎
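
For $n = 5$ the conclusion of Theorem 8 can be confirmed by machine with a different, brute-force technique: a finite group is simple exactly when the normal closure of every non-identity element is the whole group. The sketch below assumes 0-indexed permutations and hypothetical helper names; it is a numerical check for one value of $n$, not the argument used in the proof.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first)."""
    return tuple(p[q[k]] for k in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for k, pk in enumerate(p):
        inv[pk] = k
    return tuple(inv)

def is_even(p):
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0

def alternating_group(n):
    return {p for p in permutations(range(n)) if is_even(p)}

def normal_closure(x, G):
    """Smallest normal subgroup of the finite group G containing x:
    the subgroup generated by all conjugates of x."""
    gens = {compose(compose(g, x), inverse(g)) for g in G}
    identity = tuple(range(len(x)))
    N, frontier = {identity}, [identity]
    while frontier:
        y = frontier.pop()
        for g in gens:
            z = compose(y, g)
            if z not in N:
                N.add(z)
                frontier.append(z)
    return N

def is_simple(G):
    identity = tuple(range(len(next(iter(G)))))
    return len(G) > 1 and all(normal_closure(x, G) == G for x in G if x != identity)

print(is_simple(alternating_group(5)))  # True
print(is_simple(alternating_group(4)))  # False
```

In $A_4$ the normal closure of $(1\ 2)(3\ 4)$ is the four-element subgroup $K$ of Example 3, which is why the second test fails.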


Other infinite families of finite simple groups exist (in addition to the alternating
groups $A_n$, $n \geq 5$). The complete classification of these groups was first given in
1981. This was the culmination of more than 30 years of effort by hundreds of
mathematicians, yielding thousands of pages of published work. It is certainly one
of the greatest achievements of twentieth-century mathematics. One spectacular
landmark came in 1963 when J. Thompson and W. Feit proved³² a long-standing
conjecture of William Burnside that every finite nonabelian simple group has even
order (the proof is more than 250 pages long!). Thompson went on to publish the
“N-group” paper in which he introduced many fundamental techniques, and which
has been called the single most important paper in simple group theory.³³ Then in
the 1970s, M. Aschbacher carried the work forward in a series of papers, building
on the methods of Thompson. The main difficulty was the existence of sporadic
finite simple groups not belonging to any of the known families. R. L. Griess finally
constructed the largest of these, called the monster (the order is approximately
$8 \times 10^{53}$). The complete classification encompasses several infinite families of finite
simple groups and exactly 26 sporadic groups.³⁴


Exercises 2.8 


1. Consider $D_{12} = \{1, a, \ldots, a^{11}, b, ba, \ldots, ba^{11}\}$, where $o(a) = 12$, $o(b) = 2$, $aba = b$. In
   each case show that $H$ is a subgroup of $D_{12}$ and determine if $H \lhd D_{12}$.
   (a) $H = \{1, a^6, b, ba^6\}$
   (b) $H = \{1, a^4, a^8, b, ba^4, ba^8\}$
   (c) $H = \{1, a^2, a^4, a^6, a^8, a^{10}, b, ba^2, ba^4, ba^6, ba^8, ba^{10}\}$
2. Find all normal subgroups of $D_4$. [Hint: Exercise 7 below.]
3. Let $K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$. Show that $K$ is the only normal
   subgroup of $A_4$ apart from $A_4$ and $\{\varepsilon\}$. [Hint: Exercise 34 §2.6.]
4. If $D_4 = \{1, a, a^2, a^3, b, ba, ba^2, ba^3\}$, $K = \{1, b\}$ and $H = \{1, a^2, b, ba^2\}$, show that
   $K \lhd H \lhd D_4$, but $K$ is not normal in $D_4$.
5. If $K \lhd H$ and $H \lhd G$, show that $aKa^{-1} \lhd H$ for all $a \in G$. (See Theorem 5 §2.3.)
6. Let $H$ be a subgroup of a group $G$. If for each $a \in G$ there exists $b \in G$ such that
   $aH = Hb$, show that $H \lhd G$.
7. If $H \lhd G$ and $|H| = 2$, show that $H \subseteq Z(G)$. Is this true when $|H| = 3$?
8. If $H$ is a subgroup of $G$ and $K \lhd G$, show that $H \cap K \lhd H$. Is $H \cap K \lhd K$?
9. Given a group $G$, let $D = \{(g, g) \mid g \in G\}$. Show that $D$ is a normal subgroup of
   $G \times G$ if and only if $G$ is abelian.
10. Let $N \lhd G$ and $K \lhd G$. Show that $N \cap K \lhd G$.
11. Let $p$ and $q$ be distinct primes. If $G$ is a group of order $pq$ that has a unique subgroup
    of order $p$ and a unique subgroup of order $q$, show that $G$ is cyclic. [Hint: Corollary
    2 of Theorem 6 and Exercise 25 §2.4.]
12. Let $K \lhd G$ where $K$ is cyclic. Show that every subgroup of $K$ is normal in $G$.
13. Let $X$ be a nonempty subset of a group $G$.
    (a) If $G = \langle X \rangle$ (see Theorem 10 §2.4) and $H$ is a subgroup of $G$, show that $H \lhd G$
    if and only if $x^{-1}Hx \subseteq H$ for all $x \in X$.
    (b) Show that $\langle X \rangle$ is normal in $G$ if and only if $gXg^{-1} \subseteq \langle X \rangle$ for all $g \in G$.


³² Thompson, J. G., and Feit, W., Solvability of Groups of Odd Order, Pacific Journal of
Mathematics, 13 (1963), 775-1029.

³³ John Thompson was awarded the Fields Medal in 1970, the highest honor a mathematician can
receive.

³⁴ More information can be found in Chapter 17 of “Finite Groups” by D. Gorenstein, Chelsea,
1980.


14. If $G = H \times K$ is finite, find $H_1 \lhd G$ and $K_1 \lhd G$ such that $H_1 \cong H$, $K_1 \cong K$,
    $H_1 \cap K_1 = \{1\}$, and $|G| = |H_1| \cdot |K_1|$. (Converse of Theorem 6.)

15. Let $K$ be a subgroup of $G$ of index 2.
    (a) If $a \in G \smallsetminus K$ and $b \in G \smallsetminus K$, show that $ab \in K$.
    (b) If $H$ is a subgroup of $G$ and $H \not\subseteq K$, show that $|H : H \cap K| = 2$. [Hint: If
    $h_0 \in H \smallsetminus K$, show that $h \mapsto hh_0$ is a bijection $H \cap K \to H \smallsetminus (H \cap K)$.]

16. Show that $\operatorname{inn} G \lhd \operatorname{aut} G$ for any group $G$.

17. Let $D_n = \{1, a, \ldots, a^{n-1}, b, ba, \ldots, ba^{n-1}\}$ with $o(a) = n$, $o(b) = 2$, and $aba = b$.
    (a) Show that every subgroup $K$ of $\langle a \rangle$ is normal in $D_n$.
    (b) If $n$ is odd and $K \lhd D_n$, show that $K = D_n$ or $K \subseteq \langle a \rangle$.

18. (a) Let $Q$ denote the quaternion group. If $a = i$ and $b = j$ show that $Q$ has the form
    $Q = \{1, a, a^2, a^3, b, ba, ba^2, ba^3\}$ where $o(a) = 4$, $aba = b$, and $b^2 = a^2$. Show further
    that these conditions determine the Cayley table of $Q$.
    (b) If $G$ is a nonabelian group of order 8, show that $G \cong D_4$ or $G \cong Q$. [Hint: See
    Theorem 3 §2.6; use Theorem 4 and (a).]

19. If $H$ and $K$ are subgroups of $G$, show that $HK$ is a subgroup if and only if $HK \subseteq KH$,
    if and only if $KH \subseteq HK$.

20. If $G = HK$ where $H$ and $K$ are subgroups such that $hk = kh$ for all $h \in H$ and $k \in K$,
    show that $H \lhd G$ and $K \lhd G$.

21. If $H \subseteq G$ and $K \subseteq G$ are subgroups with $HK = KH$, show that $HK = \langle H \cup K \rangle$.
    (See Theorem 8 §2.4.)

22. Let $G$ be a group with $|G| = mn$ where $m$ and $n$ are relatively prime, and let $H$ and
    $K$ be subgroups where $|H| = m$, $|K| = n$. If $hk = kh$ for all $h \in H$ and $k \in K$, show
    that $G \cong H \times K$.

23. (a) Let $n = 2m$, where $m$ is odd. Show that $D_n \cong C_2 \times D_m$, where $C_2$ is cyclic of
    order 2. [Hint: Corollary 2 of Theorem 6.]
    (b) Is Dig & Cg x Dg? Justify your answer.

24. A subgroup $H$ of a group $G$ is called a characteristic subgroup if $\sigma(H) \subseteq H$ for all
    automorphisms $\sigma$ of $G$.
    (a) Show that every characteristic subgroup is normal.
    (b) Show that if $H$ is characteristic in $G$ then $\sigma(H) = H$ for all $\sigma \in \operatorname{aut} G$.
    (c) If $G = C_2 \times C_2$ show that $H = C_2 \times \{1\}$ is normal in $G$ but not characteristic.
    [Hint: Consider $\sigma : G \to G$ given by $\sigma(x, y) = (y, x)$.]
    (d) Show that the center $Z(G)$ is characteristic in $G$.
    (e) If $H \subseteq K \lhd G$ and $H$ is characteristic in $K$, show that $H \lhd G$.
    (f) If $K$ is characteristic in $H$ and $H$ is characteristic in $G$, show that $K$ is
    characteristic in $G$. (Compare with Exercise 4.)
    (g) Show that every subgroup of a cyclic group $G$ is characteristic in $G$. Is this true
    if $G$ is merely abelian?
    (h) If $H$ and $K$ are characteristic in $G$, show that $H \cap K$ is characteristic in $G$.
    (i) If $H$ is a subgroup of $G$, let $K = \{g \in G \mid g \in \sigma(H) \text{ for all } \sigma \in \operatorname{aut} G\}$. Show that $K$
    is characteristic in $G$, that $K \subseteq H$, and that $K$ contains every characteristic subgroup
    of $G$ that is contained in $H$.

25. If $X$ is a nonempty subset of a group $G$, define the normalizer $N(X)$ of $X$ by
    $N(X) = \{a \in G \mid aXa^{-1} = X\}$.
    (a) Show that $N(X)$ is a subgroup of $G$.
    (b) If $H$ is a subgroup of $G$, show that $H \lhd N(H)$.
    (c) If $H$ is a subgroup of $G$, show that $N(H)$ is the largest subgroup of $G$ in which
    $H$ is normal. That is, if $H \lhd K$, and $K$ is a subgroup of $G$, then $K \subseteq N(H)$.

26. If $H$ is a subgroup of $G$, define the core of $H$, denoted $\operatorname{core} H$, to be the intersection
    of all the conjugates of $H$ in $G$; that is,

    $$\operatorname{core} H = \{g \in G \mid g \in aHa^{-1} \text{ for all } a \in G\} = \bigcap \{aHa^{-1} \mid a \in G\}.$$

    (a) Show that $\operatorname{core} H \lhd G$ and $\operatorname{core} H \subseteq H$.
    (b) Show that $\operatorname{core} H$ is the largest normal subgroup of $G$ that is contained in $H$;
    that is, if $K \lhd G$ and $K \subseteq H$, then $K \subseteq \operatorname{core} H$.
    (c) Show that $\operatorname{core}(H \cap K) = \operatorname{core} H \cap \operatorname{core} K$ for all subgroups $H$ and $K$.

27. If $X$ is a nonempty subset of a group $G$, define the normal closure $\overline{X}$ of $X$ to be
    the intersection of all normal subgroups of $G$ that contain $X$; that is,

    $$\overline{X} = \{g \in G \mid g \in N \text{ for all } N \lhd G,\ X \subseteq N\} = \bigcap \{N \mid X \subseteq N \lhd G\}.$$

    (a) Show that $\overline{X} \lhd G$ and $X \subseteq \overline{X}$.
    (b) Show that $\overline{X}$ is the smallest normal subgroup of $G$ that contains $X$; that is,
    $X \subseteq N$ and $N \lhd G$ implies that $\overline{X} \subseteq N$.
    (c) Show that $\overline{H \cap K} \subseteq \overline{H} \cap \overline{K}$ for all subgroups $H$ and $K$ of $G$, and that this need
    not be equality.

28. If $X$ is a nonempty subset of a group $G$, define the centralizer $C(X)$ of $X$ by
    $C(X) = \{c \in G \mid cx = xc \text{ for all } x \in X\}$. Note that $C(G) = Z(G)$.
    (a) Show that $C(X)$ is a subgroup of $G$.
    (b) If $K \lhd G$, show that $C(K) \lhd G$.


2.9 FACTOR GROUPS 


If $n \geq 2$, recall the construction of $\mathbb{Z}_n$ in Section 1.3. Given the subgroup $n\mathbb{Z}$ of
$(\mathbb{Z}, +)$, the set $\mathbb{Z}_n$ consists of all “residue classes” $\bar{a} = \{x \in \mathbb{Z} \mid x \equiv a \pmod{n}\}$
where $a \in \mathbb{Z}$. These classes are really cosets $\bar{a} = n\mathbb{Z} + a$. Moreover, we defined
addition in $\mathbb{Z}_n$ by $\bar{a} + \bar{b} = \overline{a + b}$; that is,

$$(n\mathbb{Z} + a) + (n\mathbb{Z} + b) = n\mathbb{Z} + (a + b).$$

This suggests a general definition: If $K$ is a subgroup of a multiplicative group $G$,
we could define an analogous multiplication on the set of right cosets by

$$Ka\,Kb = Kab \quad \text{for all } a, b \in G. \qquad (*)$$

However, this may not make sense for some subgroups $K$ because cosets can have
different generators: $Ka = Ka_1$ can happen where $a$ and $a_1$ may not be equal.
More precisely, let $x = Ka = Ka_1$ and $y = Kb = Kb_1$ be cosets. If we multiply
$x = Ka$ and $y = Kb$ using $(*)$ we get $xy = Kab$, but if we view $x$ and $y$ as $x = Ka_1$
and $y = Kb_1$ we obtain $xy = Ka_1b_1$. Clearly, what is needed is:


If $Ka = Ka_1$ and $Kb = Kb_1$ then necessarily $Kab = Ka_1b_1$.


In this case we say that the multiplication $Ka\,Kb = Kab$ is well defined. This
condition on $K$ is equivalent to $K$ being normal in $G$.


Lemma. The following conditions are equivalent for a subgroup $K$ of $G$.
(1) $K$ is normal in $G$.
(2) $Ka\,Kb = Kab$ is a well defined multiplication of right cosets.


Proof. (1) $\Rightarrow$ (2). If $K \lhd G$, let $Ka = Ka_1$ and $Kb = Kb_1$, that is $aa_1^{-1} \in K$ and
$bb_1^{-1} \in K$. We must show that $Kab = Ka_1b_1$, that is $ab(a_1b_1)^{-1} \in K$. Compute


$$ab(a_1b_1)^{-1} = ab(b_1^{-1}a_1^{-1}) = a(bb_1^{-1})a_1^{-1} = [a(bb_1^{-1})a^{-1}](aa_1^{-1}) \in K$$


because $aKa^{-1} \subseteq K$. This is what we wanted.

(2) $\Rightarrow$ (1). If $a \in G$ we must show that $aka^{-1} \in K$ for all $k \in K$. Clearly
$Ka = Ka$ and $Kk = K1$, so applying (2) gives $Kak = Ka1$, that is $Kak = Ka$.
But then $(ak)a^{-1} \in K$, as required. ∎


Theorem 1. Let $K \lhd G$ and write $G/K = \{Ka \mid a \in G\}$ for the set of cosets.
(1) $G/K$ is a group under the operation $Ka\,Kb = Kab$.
(2) The mapping $\varphi : G \to G/K$ given by $\varphi(a) = Ka$ is an onto homomorphism.
(3) If $G$ is abelian, then $G/K$ is abelian.
(4) If $G = \langle a \rangle$ is cyclic, then $G/K$ is also cyclic; in fact, $G/K = \langle Ka \rangle$.
(5) If $|G : K|$ is finite then $|G/K| = |G : K|$; if $|G|$ is finite then $|G/K| = \frac{|G|}{|K|}$.


Proof. (1) The operation on $G/K$ is well defined by the Lemma. The unity of
$G/K$ is $K = K1$ because $Ka\,K1 = Ka = K1\,Ka$ for all $Ka$ in $G/K$. We have
$Ka\,Ka^{-1} = K1 = Ka^{-1}\,Ka$, so the inverse of the coset $Ka$ is $(Ka)^{-1} = Ka^{-1}$.
Finally, associativity in $G/K$ is inherited from $G$:


$$Ka(Kb\,Kc) = Ka\,Kbc = Ka(bc) = K(ab)c = Kab\,Kc = (Ka\,Kb)Kc.$$


(2) We have $\varphi(a)\,\varphi(b) = Ka\,Kb = Kab = \varphi(ab)$ for all $a, b \in G$, so $\varphi$ is a
homomorphism. It is clearly onto.

(3) If $G$ is abelian, $Ka\,Kb = Kab = Kba = Kb\,Ka$, proving (3).

(4) Let $G = \langle a \rangle = \{a^k \mid k \in \mathbb{Z}\}$, so every coset in $G/K$ has the form $Ka^k$ for
some integer $k$. If $\varphi$ is the map in (2), then $Ka^k = \varphi(a^k) = \varphi(a)^k = (Ka)^k$ by
Theorem 1 §2.5. It follows that $G/K = \langle Ka \rangle$, as required.

(5) As $|G : K|$ is finite, $|G/K| = |G : K|$ is the definition of the index $|G : K|$.
If $|G|$ is finite, then $|G/K| = |G|/|K|$ is (2) of Lagrange's theorem. ∎


Thus, if $n \geq 2$ in $\mathbb{Z}$ then $\mathbb{Z}/n\mathbb{Z} = \mathbb{Z}_n$ as additive groups. If $K$ is a normal subgroup of
a group $G$, write $G/K = \{Ka \mid a \in G\}$ as in Theorem 1. We make two definitions:

The group $G/K$ of all cosets of $K$ in $G$ is called the factor group of $G$ by $K$.
The homomorphism $\varphi : G \to G/K$ where $\varphi(a) = Ka$ is called the coset map.

Hence the unity of $G/K$ is $K = K1$, and inverses are given by $(Ka)^{-1} = Ka^{-1}$.
It is important for a student of group theory (and ring theory for that matter) to 
develop skill in working with factor groups. All the group theoretic techniques we 
have developed up to now apply to these groups; the only new aspect is that the 
elements are now cosets. 


Example 1. If $G$ is a group then we always have $G \triangleleft G$ and $\{1\} \triangleleft G$.
If $K = G$ there is only one coset, so $G/K = \{G\}$ is the group with one element.
If $K = \{1\}$, then $Ka = \{a\}$ for each $a \in G$, so $G/K$ is the set of all singleton
subsets of $G$. The operation is $\{a\}\{b\} = \{ab\}$, so $G/K \cong G$ in this case.


Example 2. Let $G = \langle a \rangle$ where $o(a) = 12$, and let $K = \langle a^4 \rangle$. Find all the cosets
in $G/K$ and write down the Cayley table.

Solution. Note first that $K \triangleleft G$ because $G$ is abelian. The cosets are

$$K = \{1, a^4, a^8\}, \quad Ka = \{a, a^5, a^9\}, \quad Ka^2 = \{a^2, a^6, a^{10}\}, \quad Ka^3 = \{a^3, a^7, a^{11}\}.$$

Two computations are needed to fill in the Cayley table: $Ka \cdot Ka^3 = K$ (because
$a^4 \in K$) and $Ka^2 \cdot Ka^3 = Ka^5 = Ka$ (because $a^5 \in Ka$). Then

G/K  | K     Ka    Ka^2  Ka^3
-----+------------------------
K    | K     Ka    Ka^2  Ka^3
Ka   | Ka    Ka^2  Ka^3  K
Ka^2 | Ka^2  Ka^3  K     Ka
Ka^3 | Ka^3  K     Ka    Ka^2

We have $G/K = \langle Ka \rangle$ as $Ka^2 = (Ka)^2$, $Ka^3 = (Ka)^3$, and $K = Ka^4 = (Ka)^4$.
This confirms Theorem 1(4) in this case. □
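
The cosets and Cayley table of Example 2 can be reproduced mechanically. This is a sketch under the assumption that the element $a^i$ of $G = \langle a \rangle$ is represented by its exponent $i$ modulo 12, so multiplication in $G$ becomes addition of exponents; `coset`, `multiply` and `cayley_table` are hypothetical helper names.

```python
N = 12                           # o(a) = 12; a^i is stored as the exponent i
STEP = 4                         # K = <a^4>
K = frozenset(range(0, N, STEP)) # exponents {0, 4, 8}

def coset(i):
    """The coset K a^i as a frozenset of exponents."""
    return frozenset((k + i) % N for k in K)

cosets = [coset(i) for i in range(STEP)]   # K, Ka, Ka^2, Ka^3

def multiply(A, B):
    """Product of two cosets, computed from arbitrary representatives."""
    return coset(min(A) + min(B))

def cayley_table():
    """Entry [i][j] is the index m with (K a^i)(K a^j) = K a^m."""
    return [[cosets.index(multiply(A, B)) for B in cosets] for A in cosets]

for row in cayley_table():
    print(["K" if m == 0 else f"Ka^{m}" for m in row])
```

Each row of the printed table is a cyclic shift of the previous one, which is the pattern of a cyclic group of order 4 generated by $Ka$.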
Example 3. Let $K = \{\varepsilon, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\}$. Show that
$K \triangleleft A_4$, find all the cosets in $A_4/K$, and write down the Cayley table.

Solution. We showed that $K \triangleleft A_4$ in Example 3 §2.8. The cosets are

$$K\varepsilon = \varepsilon K = \{\varepsilon, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\}$$
$$K(1\,2\,3) = (1\,2\,3)K = \{(1\,2\,3), (2\,4\,3), (1\,4\,2), (1\,3\,4)\}$$
$$K(1\,3\,2) = (1\,3\,2)K = \{(1\,3\,2), (1\,4\,3), (2\,3\,4), (1\,2\,4)\}$$

The Cayley table is as shown.

A_4/K    | K         K(1 2 3)  K(1 3 2)
---------+-----------------------------
K        | K         K(1 2 3)  K(1 3 2)
K(1 2 3) | K(1 2 3)  K(1 3 2)  K
K(1 3 2) | K(1 3 2)  K         K(1 2 3)

Here the fact that $K(1\,3\,2) = [K(1\,2\,3)]^2$ shows that $A_4/K = \langle K(1\,2\,3) \rangle$ is
cyclic. Of course, this also follows from the fact that $|A_4/K| = 3$ is prime. □


Example 4. Consider the octic group $D_4 = \{1, a, a^2, a^3, b, ba, ba^2, ba^3\}$, where
$o(a) = 4$, $o(b) = 2$, and $aba = b$. Show that $Z(D_4) = \{1, a^2\}$ and that $D_4/Z(D_4)$
is isomorphic to the Klein group $K_4$.

Solution. Write $Z = Z(D_4)$ for short. We have $a(ba^k) = ba^{k+3} \neq ba^{k+1} = (ba^k)a$ for
each $k$, so $ba^k \notin Z$. Similarly, $ab = ba^3 \neq ba$ and $a^3b = ba \neq ba^3$ show that $a \notin Z$
and $a^3 \notin Z$. Hence $Z \subseteq \{1, a^2\}$. However, $a^2b = ba^2$, so $a^2$ commutes with both
generators $a$ and $b$ of $D_4$. This implies that $a^2 \in Z$, so $Z = \{1, a^2\}$.

Of course, $Z = Z(D_4)$ is normal in $D_4$, and the cosets are

$$Z = \{1, a^2\}, \quad Za = \{a, a^3\}, \quad Zb = \{b, ba^2\}, \quad \text{and} \quad Zba = \{ba, ba^3\}.$$

Thus $D_4/Z = \{Z, Za, Zb, Zba\}$, and the Cayley table is as shown.

D_4/Z | Z    Za   Zb   Zba
------+-------------------
Z     | Z    Za   Zb   Zba
Za    | Za   Z    Zba  Zb
Zb    | Zb   Zba  Z    Za
Zba   | Zba  Zb   Za   Z

This is evidently the noncyclic group of order 4, that is $D_4/Z \cong K_4$. □


Example 5. Let $G = \langle a \rangle$, where $o(a) = 18$, and let $K = \langle a^6 \rangle$. Find the order of
the element $Ka^5$ in $G/K$.

Solution 1. As $K = \{1, a^6, a^{12}\}$, we have $|G/K| = |G|/|K| = 18/3 = 6$. Then, from
Lagrange's theorem, the order $o(Ka^5)$ of $Ka^5$ is 1, 2, 3, or 6. Now

$$Ka^5 \neq K, \quad (Ka^5)^2 = Ka^{10} \neq K, \quad \text{and} \quad (Ka^5)^3 = Ka^{15} \neq K.$$

Hence $o(Ka^5)$ is not 1, 2, or 3, so it must be 6.

Solution 2. We have $o(a^5) = 18$ because $\gcd(5, 18) = 1$, so $G = \langle a^5 \rangle$. Hence $G/K = \langle Ka^5 \rangle$ by
Theorem 1. Because $|G/K| = 6$, this means that $o(Ka^5) = 6$. □
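
Solution 1 amounts to raising $Ka^5$ to successive powers until the result lands in $K$. A minimal sketch of that search, again assuming $a^i$ is stored as the exponent $i$ modulo $o(a)$; the function name `coset_order` and its parameters are illustrative only.

```python
def coset_order(n, d, e):
    """Order of the coset K a^e in <a>/K, where o(a) = n and K = <a^d>.

    Elements a^i are represented by exponents i modulo n, so the order is
    the least m >= 1 for which the exponent m*e lies in K.
    """
    K = {(d * j) % n for j in range(n)}   # exponents of the elements of K
    m, x = 1, e % n
    while x not in K:
        x = (x + e) % n                   # multiply once more by a^e
        m += 1
    return m

print(coset_order(18, 6, 5))   # order of K a^5 in Example 5
```

The loop visits the exponents 5, 10, 15, 2, 7, 12 and stops at 12, which is the first one in $K$, in agreement with $o(Ka^5) = 6$.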


The next theorem provides a useful method of proving that a group is abelian. 
We include it for reference later. 


Theorem 2. Suppose a group $G$ has a subgroup $K \subseteq Z(G)$ such that $G/K$ is
cyclic. Then $G$ is abelian.

Proof. Let $G/K = \langle Kg \rangle$. If $a, b \in G$, this means that we can write $Ka$ and $Kb$ in
the form $Ka = Kg^m$ and $Kb = Kg^n$. Thus $a = kg^m$ and $b = k_1g^n$ where $k$ and $k_1$
are in $K$ (and hence are central in $G$ by hypothesis). But then

$$ab = (kg^m)(k_1g^n) = kk_1g^{m+n} = k_1kg^{n+m} = (k_1g^n)(kg^m) = ba.$$

This shows that $G$ is abelian. ■


The Derived Subgroup 


If $H \triangleleft G$ there is a useful test for determining when a factor group $G/H$ is abelian.
To motivate it, consider the following way of deciding that two cosets $Ha$ and $Hb$
commute:

$$Ha\,Hb = Hb\,Ha \iff Hab = Hba \iff ab(ba)^{-1} \in H \iff aba^{-1}b^{-1} \in H.$$

With this in mind, an element in a group $G$ of the form $aba^{-1}b^{-1}$ is called a
commutator and is denoted

$$[a, b] = aba^{-1}b^{-1}, \quad \text{for any } a, b \text{ in } G.$$

Hence, if $H \triangleleft G$ then $G/H$ is abelian if and only if $H$ contains every commutator.
The commutator subgroup or derived subgroup $G'$ of $G$ is defined by

$$G' = \{\text{all finite products of commutators from } G\}.$$


To see that G’ really is a subgroup of G, note that 1 = [a, a] is in G’ and that G’ is 
clearly closed under the operation of G. The fact that G’ is closed under inverses 
follows from the first of the following easily verified properties of commutators. 


(1) $[a, b]^{-1} = [b, a]$.
(2) $g[a, b]g^{-1} = [gag^{-1}, gbg^{-1}]$ for all $g \in G$.


These facts reveal the relationship between G’ and the abelian factor groups of G. 


Theorem 3. Let $G$ be a group and let $H$ be a subgroup of $G$.
(1) $G'$ is a normal subgroup of $G$ and $G/G'$ is abelian.
(2) $G' \subseteq H$ if and only if $H$ is normal in $G$ and $G/H$ is abelian.

Proof. We have already established that $G'$ is a subgroup of $G$. Since $(2) \Rightarrow (1)$
by taking $H = G'$, we prove only (2). If $H \triangleleft G$, the above argument shows that
$G/H$ is abelian if and only if every commutator belongs to $H$, that is, if and only
if $G' \subseteq H$. Hence it remains to show that $G' \subseteq H$ implies that $H \triangleleft G$. If $G' \subseteq H$,
let $g \in G$ and $h \in H$. Then

$$ghg^{-1} = (ghg^{-1}h^{-1})h = [g, h]h \in G'h \subseteq Hh = H.$$

Thus $gHg^{-1} \subseteq H$, so $H \triangleleft G$ as required. ■


Hence $G' = \{1\}$ if and only if $G$ is abelian. Since $G$ is abelian if and only if
$Z(G) = G$, this contrasts the ways in which $G'$ and $Z(G)$ measure the commutativity
of the group $G$.

Theorem 3 asserts that G’ is the smallest normal subgroup H of G with the 
property that the factor group G/H is abelian. This fact can be very useful in 
computing G’, as Example 6 illustrates. 


Example 6. Compute $D_4'$, where $D_4 = \{1, a, a^2, a^3, b, ba, ba^2, ba^3\}$, $o(a) = 4$,
$o(b) = 2$, and $aba = b$.

Solution. In Example 4 we showed that the center of $D_4$ is $Z = \{1, a^2\}$ and that
$D_4/Z$ is abelian. Hence $D_4' \subseteq Z$ by Theorem 3 and so, because $|Z| = 2$, either
$D_4' = \{1\}$ or $D_4' = Z$. But $D_4' = \{1\}$ is impossible because $D_4/\{1\} \cong D_4$ is not
abelian. Hence $D_4' = Z = Z(D_4)$. □
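
The conclusion of Example 6 can be confirmed by direct computation. The sketch below assumes a concrete model of $D_4$ that is not in the text: $a$ is the rotation $i \mapsto i + 1$ and $b$ the reflection $i \mapsto -i$ of the vertices 0, 1, 2, 3 of a square (mod 4), which satisfy $o(a) = 4$, $o(b) = 2$ and $aba = b$. All function names are illustrative.

```python
def compose(p, q):
    """Return p∘q (apply q first, then p) for permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

def generate(gens):
    """Subgroup generated by gens (finite case: close under right multiplication)."""
    identity = tuple(range(len(gens[0])))
    elems, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems

def commutator(x, y):
    """[x, y] = x y x^{-1} y^{-1}."""
    return compose(compose(x, y), compose(inverse(x), inverse(y)))

a = (1, 2, 3, 0)   # rotation i -> i + 1 (mod 4)
b = (0, 3, 2, 1)   # reflection i -> -i (mod 4)
D4 = generate([a, b])

derived = generate([commutator(x, y) for x in D4 for y in D4])
center = {z for z in D4 if all(compose(z, g) == compose(g, z) for g in D4)}
print(len(D4), sorted(derived), sorted(center))
```

Both `derived` and `center` come out as the two-element set consisting of the identity and $a^2$, matching $D_4' = Z(D_4)$.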


Exercises 2.9 


1. In each case find the cosets in G/K, write down the Cayley table of G/K, and describe 
the group G/K. 
(a) G= Dg and K = Z(Dg) 
(b) G=Q and K = Z(Q) 
(c) G= Ax B, A and B arbitrary groups, and K = {(a,1) | a € A} 
(d) G = (a) x (b), where o(a)= 8 and o(b)= 2, and K = ((a?,b)) 
2. An integer n is called an exponent for a group G if g" = 1 for every g in G. If n is 
an exponent for G, show that it is an exponent for every factor group G/K. 
8. If G = (a), o(a)= 24, let K = (a?) and H = (a°). 
(a) In G/K, find the order of the elements Ka?, Ka°, Ka*, and Ka’. 
(b) In G/H, find the order of the elements Ha?, Ha*, Ha*, and Ha’. 
4, Let G = (a) x (b), where o(a)= 8 and o(b)= 12. 
(a) If K = ((a,6°)), find the order of K(a*,b) in G/K. 
(b) If K = ((a,b?)) , find the order of K(a?,b) in G/K. 


5. Let $G = D_6 = \langle a, b \rangle$, where $o(a) = 6$, $o(b) = 2$, and $aba = b$.
(a) If K = (a), find the order of Ka®, Ka®, Ka®, and Kba in G/K. 
(b) If K = (a3,b), find the order of Ka?, Ka5, and Kba? in G/K. 


6. If $Q$ denotes the quaternion group, show that $Q/Z(Q)$ has order 4. Is it cyclic or
isomorphic to the Klein group? Support your answer.
7. Show that $\mathbb{Q}/\mathbb{Z}$ is an infinite abelian group in which every element has finite order.
8. Let $K \subseteq H \subseteq G$ be finite groups, with $K \triangleleft G$. Show that $H/K = \{Kh \mid h \in H\}$ is a
subgroup of $G/K$, and $|G/K : H/K| = |G : H|$.
9. If $K \triangleleft G$ and $o(g) = n$, $g \in G$, show that the order of $Kg$ in $G/K$ divides $n$.
10. If $K \triangleleft G$ has index $m$, show that $g^m \in K$ for all $g \in G$.
11. If $K \triangleleft G$ has index $m$ and if $\gcd(m, n) = 1$, show that $K$ contains every element of
$G$ of order $n$.

12. Let $G$ be a finite group and let $K \triangleleft G$. If $G/K$ has an element of order $n$, show that
$G$ has an element of order $n$.
13. Let $K \triangleleft G$. In each case, if both $K$ and $G/K$ have the given property, show that $G$
also has the property.
(a) Trivial center.
(b) Every element has finite order.
(c) Every element has order a power of a fixed prime $p$.
(d) Finitely generated.
14. If $K \triangleleft G$ has prime index $p$, show that $G = K \cup Ka \cup \cdots \cup Ka^{p-1}$ is a disjoint union
for some $a \in G$.
15. If $G = \langle X \rangle$ is generated by $X$, and if $K \triangleleft G$, show that $G/K$ is generated by
$\{Kx \mid x \in X\}$.
16. Let $H$ be a subset of $G$ that is closed under the group operation. If $g^2 \in H$ for all
$g \in G$, show that $H$ is a normal subgroup of $G$ and $G/H$ is abelian.
17. If $G$ is abelian, let $T(G)$ denote the set of elements in $G$ of finite order.
(a) Show that $T(G)$ is a subgroup of $G$ (the torsion subgroup).
(b) Call $G$ a torsion-free group if $T(G) = \{1\}$. Show that $G/T(G)$ is torsion free.
(c) Call $G$ a torsion group if $T(G) = G$. If $H$ is a subgroup of $G$, show that $G$ is a
torsion group if and only if both $H$ and $G/H$ are torsion groups.
18. Let $K \subseteq H \subseteq G$ be groups, where $K \triangleleft G$ and $|G : K|$ is finite. Show that
$|G/K : H/K|$ is also finite and that $|G/K : H/K| = |G : H|$.

Find G’ in each case. 

(a) G is abelian (b)G=Q (c) G= De (d) G=S, 

20. Show that $G'$ is a characteristic subgroup of $G$ for every group $G$.
21. Show that $(G \times H)' = G' \times H'$.
22. If $H$ is a subgroup of $G$, show that $H' \subseteq H \cap G'$. Show that this may not be equality.
23. Let $K \triangleleft G$.
(a) If $K \subseteq H$ where $H$ is a subgroup of $G$, show that $H/K$ is a subgroup of $G/K$.
(b) If $\mathcal{X}$ is a subgroup of $G/K$, show that $\mathcal{X} = H/K$ where $H = \{h \in G \mid Kh \in \mathcal{X}\}$
is a subgroup of $G$ containing $K$.
24. If $K \triangleleft G$ and $K \cap G' = \{1\}$, show that $K \subseteq Z(G)$ and that $Z[G/K] = Z(G)/K$.
25. Let $K \triangleleft G$.
(a) Show that $[Ka, Kb] = K[a, b]$ for all $a, b \in G$.
(b) If $K \subseteq G'$, show that $(G/K)' = G'/K$.
26. Let $K \subseteq Z(G)$ be a subgroup such that $G/K = \langle Kx_1, \ldots, Kx_n \rangle$ where $x_ix_j = x_jx_i$
in $G$ for all $i$ and $j$. Show that $G$ is abelian. (This extends Theorem 2.)




27. Let K C H CG be groups with K characteristic in G. If H/K is characteristic in 
G/K, show that H is characteristic in G. [See Exercise 24 §2.8.] 
28. (a) Show that |G: Z(G)| cannot be a prime for any group G. 
(b) Show that G = Ds is a nonabelian group G such that G/Z(G) is abelian. 
29. If $k \mid n$, $k > 2$, show that $D_n$ has a normal subgroup $K$ such that $D_n/K \cong D_k$. [Hint:
If $D_n$ is generated by $a$ and $b$ where $o(a) = n$, $o(b) = 2$, and $aba = b$, take $K = \langle a^k \rangle$.]
30. If $K = \{\varepsilon, (1\,2)(3\,4), (1\,3)(2\,4), (1\,4)(2\,3)\}$, show that $S_4/K \cong D_3$. [Hint:
Exercise 3 §2.8.]
31. Let G= C4xCsg. 
(a) Find subgroups H and K of G such that H & K but G/H #G/K. 
(b) Find subgroups P and Q of G such that G/P = G/Q but P#Q. 


2.10 THE ISOMORPHISM THEOREM 


There is a connection between normal subgroups, homomorphisms and factor
groups. The main relationship between these concepts is embodied in the
isomorphism theorem, which is the principal result in this section and one of the most
useful theorems in group theory. To describe it, we begin by identifying two
subgroups associated with a group homomorphism $\alpha : G \to H$. The first is

The image of $\alpha$, defined by $\operatorname{im} \alpha = \alpha(G) = \{\alpha(g) \mid g \in G\}$.

This is a subgroup of $H$ as was shown in Corollary 2 of Theorem 1 §2.5. We now
turn to a subgroup of $G$ determined by $\alpha : G \to H$:

The kernel of $\alpha$, defined by $\ker \alpha = \{k \in G \mid \alpha(k) = 1\}$.
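
Theorem 1. Let $\alpha : G \to H$ be a group homomorphism.
(1) $\alpha(G)$ is a subgroup of $H$.
(2) $\ker \alpha$ is a normal subgroup of $G$.

Proof. (1) This is Corollary 2 of Theorem 1 §2.5.
(2) We have $1 \in \ker \alpha$ because $\alpha(1) = 1$. If $k, k' \in \ker \alpha$, then

$$\alpha(kk') = \alpha(k) \cdot \alpha(k') = 1 \cdot 1 = 1 \quad \text{and} \quad \alpha(k^{-1}) = \alpha(k)^{-1} = 1^{-1} = 1.$$

Hence $kk' \in \ker \alpha$ and $k^{-1} \in \ker \alpha$, so $\ker \alpha$ is a subgroup. If $g \in G$ and $k \in \ker \alpha$ then

$$\alpha(gkg^{-1}) = \alpha(g) \cdot \alpha(k) \cdot \alpha(g^{-1}) = \alpha(g) \cdot 1 \cdot \alpha(g)^{-1} = 1.$$

This shows that $g(\ker \alpha)g^{-1} \subseteq \ker \alpha$ for all $g \in G$, and so proves that $\ker \alpha \triangleleft G$. ■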


Note that the image of a homomorphism $\alpha : G \to H$ need not be normal in $H$. For
example, if $K$ is any subgroup of $H$, define the inclusion mapping $\iota : K \to H$ by
$\iota(k) = k$ for all $k \in K$. This is a one-to-one homomorphism, but $\iota(K) = K$ need not
be normal in $H$.

Theorem 1 shows that kernels of homomorphisms from $G$ are normal in $G$.
Conversely, every normal subgroup of a group $G$ arises as the kernel of some
homomorphism with $G$ as domain:


Theorem 2. If $K \triangleleft G$, then $K = \ker \varphi$ where $\varphi : G \to G/K$ is the coset mapping.

Proof. The coset map $\varphi$ is defined by $\varphi(g) = Kg$ for all $g \in G$ and is a
homomorphism by Theorem 1 §2.9. Because $K$ is the unity of the group $G/K$, we have
$g \in \ker \varphi$ if and only if $Kg = K$, if and only if $g \in K$. Hence $\ker \varphi = K$. ■


Many important subgroups are kernels of naturally occurring homomorphisms; 
indeed, the easiest way to verify that a subgroup of a group G is normal in G is 
often to exhibit it as the kernel of a homomorphism with G as domain. 


Example 1. The absolute value homomorphism $\mathbb{C}^* \to \mathbb{R}^*$ given by $z \mapsto |z|$ has
kernel the circle group $\mathbb{C}^0 = \{z \in \mathbb{C}^* \mid |z| = 1\}$.

Example 2. The kernel of the determinant homomorphism $A \mapsto \det A$ from
$GL_n(\mathbb{R}) \to \mathbb{R}^*$ is the special linear group $SL_n(\mathbb{R}) = \{A \in M_n(\mathbb{R}) \mid \det A = 1\}$.

Example 3. If $G$ is a group and $g \in G$ has finite order $n$, let $\alpha : \mathbb{Z} \to G$ be the
exponent mapping given by $\alpha(k) = g^k$. Then $\ker \alpha = n\mathbb{Z}$ by Theorem 2 §2.4.

Example 4. Show that $A_n \triangleleft S_n$ by exhibiting $A_n$ as a kernel.

Solution. Define the sign of a permutation $\sigma \in S_n$ by

$$\operatorname{sgn} \sigma = \begin{cases} 1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd} \end{cases}$$

Then the sign mapping $\alpha : S_n \to \{1, -1\}$ given by $\alpha(\sigma) = \operatorname{sgn} \sigma$ is a homomorphism
(see Exercise 29 §1.4). Clearly $\ker \alpha = A_n$. □

Example 5. The trivial homomorphism $G \to H$ is the only one with $G$ as kernel.


It is clear that a homomorphism $\alpha : G \to H$ is onto if and only if $\alpha(G) = H$,
that is, if and only if the image $\alpha(G)$ is as large a subgroup of $H$ as possible. The
next theorem shows that $\alpha$ is one-to-one if and only if $\ker \alpha$ is as small as possible.

Theorem 3. If $\alpha : G \to H$ is a homomorphism, then $\alpha$ is one-to-one if and only if
$\ker \alpha = \{1\}$.

Proof. If $\alpha$ is one-to-one, let $g \in \ker \alpha$. Thus $\alpha(g) = 1 = \alpha(1)$, so $g = 1$ because
$\alpha$ is one-to-one. Hence $\ker \alpha = \{1\}$. Conversely, let $\ker \alpha = \{1\}$ and suppose
that $\alpha(a) = \alpha(b)$ where $a$ and $b$ are in $G$. Then $\alpha(ab^{-1}) = \alpha(a)\alpha(b)^{-1} = 1$, so
$ab^{-1} \in \ker \alpha = \{1\}$. This shows that $ab^{-1} = 1$ and hence that $a = b$. Thus $\alpha$ is
one-to-one. ■


Theorem 3 is used frequently to test when a homomorphism is one-to-one. 
We now come to one of the most useful theorems in group theory. 


Theorem 4. Isomorphism Theorem$^{35}$. Let $\alpha : G \to H$ be a group
homomorphism and write $K = \ker \alpha$. Then

$$\alpha(G) \cong G/\ker \alpha.$$

Proof. Write $K = \ker \alpha$ for simplicity, and define

$$\bar{\alpha} : G/K \to \alpha(G) \quad \text{by} \quad \bar{\alpha}(Kg) = \alpha(g) \text{ for all } Kg \in G/K.$$

First $\bar{\alpha}$ is well defined; that is, $Kg = Kg_1$ implies that $\alpha(g) = \alpha(g_1)$. In fact,

$$Kg = Kg_1 \iff gg_1^{-1} \in K \iff \alpha(gg_1^{-1}) = 1 \iff \alpha(g) = \alpha(g_1).$$

Hence $\bar{\alpha}$ is well defined ($\Rightarrow$) and one-to-one ($\Leftarrow$). As $\bar{\alpha}$ is clearly onto $\alpha(G)$, it
remains to show that it is a homomorphism. But

$$\bar{\alpha}(Kg\,Kg_1) = \bar{\alpha}(Kgg_1) = \alpha(gg_1) = \alpha(g) \cdot \alpha(g_1) = \bar{\alpha}(Kg) \cdot \bar{\alpha}(Kg_1)$$

holds for all $Kg$ and $Kg_1$ in $G/K$. ■

35. This result goes back to Camille Jordan (1838-1922) in his book Traité des Substitutions (1870),
where the concept of a homomorphism was introduced.

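
The construction of the induced map in the proof can be traced on a small example. The following sketch assumes an illustrative homomorphism that is not taken from the text, namely $\alpha : \mathbb{Z}_{12} \to \mathbb{Z}_{12}$ with $\alpha(g) = 3g$, written additively; the names `alpha`, `alpha_bar` and `coset_of` are hypothetical.

```python
N = 12
G = range(N)

def alpha(g):
    """Illustrative homomorphism Z_12 -> Z_12, g -> 3g (additive notation)."""
    return (3 * g) % N

kernel = frozenset(g for g in G if alpha(g) == 0)
image = {alpha(g) for g in G}

def coset_of(g):
    """The coset K + g of the kernel K containing g."""
    return frozenset((k + g) % N for k in kernel)

cosets = {coset_of(g) for g in G}

# Induced map on cosets: alpha takes a single value on each coset,
# which is exactly the statement that alpha_bar is well defined.
alpha_bar = {}
for C in cosets:
    values = {alpha(g) for g in C}
    assert len(values) == 1
    alpha_bar[C] = values.pop()

print(sorted(kernel), sorted(image), len(cosets))
```

Here the kernel is {0, 4, 8}, the image is {0, 3, 6, 9}, and the four cosets of the kernel are sent to four distinct image values, so `alpha_bar` is a bijection from $G/K$ onto $\alpha(G)$ in this instance.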
If $G$ is a group, a group of the form $\alpha(G)$ where $\alpha : G \to H$ is some
homomorphism is called a homomorphic image of $G$. Hence the isomorphism theorem
shows that the factor groups of $G$ and the homomorphic images of $G$ are the same
set of groups up to isomorphism.


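Remark. The diagram depicts the mappings $\alpha$ and $\bar{\alpha}$ in the isomorphism theorem.

[Diagram: a triangle with $\alpha : G \to H$, $\varphi : G \to G/K$, and $\bar{\alpha} : G/K \to H$.]

Here $K = \ker \alpha$ as in the theorem, and the mapping $\varphi : G \to G/K$ is the coset mapping.
Note that $\alpha = \bar{\alpha}\varphi$ is a factorization of the (arbitrary) homomorphism $\alpha$ as a
composite where $\varphi$ is onto and $\bar{\alpha}$ is one-to-one. Indeed,
$\bar{\alpha}\varphi(g) = \bar{\alpha}[\varphi(g)] = \bar{\alpha}(Kg) = \alpha(g)$ for all $g \in G$. Moreover, $\bar{\alpha}$ is the only
homomorphism $G/K \to H$ with the property that $\bar{\alpha}\varphi = \alpha$. Indeed, if this condition
holds then the action of $\bar{\alpha}$ is determined: $\bar{\alpha}(Kg) = \bar{\alpha}[\varphi(g)] = \bar{\alpha}\varphi(g) = \alpha(g)$ for all
$Kg$ in $G/K$. Hence:

Corollary. Let $\alpha : G \to H$ be a group homomorphism. Then $\alpha$ factors uniquely as
$\alpha = \bar{\alpha}\varphi$ where $\varphi : G \to G/\ker \alpha$ is the coset map, and $\bar{\alpha} : G/\ker \alpha \to H$ is defined
in Theorem 4. Note that $\varphi$ is onto, and $\bar{\alpha}$ is one-to-one.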


The isomorphism theorem is a marvelous result. It sheds light on nearly every
situation to which it is applied. It is used as follows: If we want to show that
$G/K \cong H$, we find an onto homomorphism $G \to H$ with kernel $K$. As a bonus, the
fact that $K$ is a kernel automatically proves that it is normal in $G$. Examples 6-9
illustrate the use of the isomorphism theorem.


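Example 6. If $G$ is a cyclic group, show that $G \cong \mathbb{Z}$ or $G \cong \mathbb{Z}_n$.

Solution. Let $G = \langle a \rangle$ and define $\alpha : \mathbb{Z} \to G$ by $\alpha(k) = a^k$ for all $k \in \mathbb{Z}$. This is an
onto homomorphism and $\ker \alpha = \{k \mid a^k = 1\}$. If $o(a)$ is infinite, $\ker \alpha = \{0\}$ and
the isomorphism theorem gives $G \cong \mathbb{Z}/\{0\} \cong \mathbb{Z}$. If $o(a) = n$, then $\ker \alpha = n\mathbb{Z}$ and
$G \cong \mathbb{Z}/n\mathbb{Z} = \mathbb{Z}_n$. □

Example 7. Let $K \triangleleft G$ and $K_1 \triangleleft G_1$. Show that $(K \times K_1) \triangleleft (G \times G_1)$ and
$(G \times G_1)/(K \times K_1) \cong (G/K) \times (G_1/K_1)$.

Solution. We define $\alpha : (G \times G_1) \to (G/K) \times (G_1/K_1)$ by $\alpha(g, g_1) = (Kg, K_1g_1)$. It
is routine to verify that this is an onto homomorphism, and $\ker \alpha = K \times K_1$. The
isomorphism theorem now gives all our assertions. □

Example 8. Show that $\mathbb{R}/\mathbb{Z} \cong \mathbb{C}^0$ where $\mathbb{C}^0 = \{e^{i\theta} \mid \theta \in \mathbb{R}\}$ is the circle group.

Solution. We define $\alpha : \mathbb{R} \to \mathbb{C}^0$ by $\alpha(x) = e^{2\pi xi}$. We have

$$\alpha(x + y) = e^{2\pi(x+y)i} = e^{2\pi xi}e^{2\pi yi} = \alpha(x) \cdot \alpha(y)$$

so $\alpha$ is a homomorphism. It is clearly onto, and

$$\alpha(x) = 1 \iff e^{2\pi xi} = 1 \iff x \in \mathbb{Z}.$$

Thus, $\ker \alpha = \mathbb{Z}$ and the isomorphism theorem does the rest. □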


If we are interested in determining all homomorphisms $\alpha : G \to G_1$, the fact
that $\alpha(G)$ is isomorphic to $G/(\ker \alpha)$ is useful because sometimes we can
determine the normal subgroups of $G$. In Example 9 §2.5, we showed that there are at
most six homomorphisms $S_3 \to C_6$, and hence at most six $D_3 \to C_6$. Using the
isomorphism theorem, we can show that in fact there are only two.


Example 9. Write $D_3 = \{1, a, a^2, b, ba, ba^2\}$, where $o(a) = 3$, $o(b) = 2$, and $aba = b$,
and write $C_6 = \langle c \rangle$, where $o(c) = 6$. Show that there are only two homomorphisms
$D_3 \to C_6$, the trivial one and

$$\alpha : D_3 \to C_6 \quad \text{defined by} \quad \alpha(b^ka^m) = c^{3k} \text{ for all } b^ka^m \in D_3.$$

Solution. We know that $D_3$ has only three normal subgroups: $\{1\}$, $D_3$, and $K = \langle a \rangle$.
Thus if $\alpha : D_3 \to C_6$ is a homomorphism, $\ker \alpha$ must be one of them. It is impossible
that $\ker \alpha = \{1\}$ because then $\alpha(D_3) \cong D_3$ would be a nonabelian subgroup of $C_6$.
If $\ker \alpha = D_3$ then $\alpha$ is the trivial homomorphism. So assume that $\ker \alpha = K = \langle a \rangle$.
In this case, let $\varphi : D_3 \to D_3/K$ be the coset map. If $\alpha$ exists, the corollary to the
isomorphism theorem guarantees that $\alpha = \sigma\varphi$, where $\sigma : D_3/K \to \alpha(D_3)$ is an
isomorphism. In this case $D_3/K = \{K, bK\}$ is cyclic of
order 2, so $\alpha(D_3)$ is the (unique) subgroup of order 2
in $C_6$; that is $\alpha(D_3) = \{1, c^3\}$. Clearly, $\sigma(K) = 1$ and
$\sigma(bK) = c^3$, so $\alpha = \sigma\varphi$ is given by

$$\alpha(b^ka^m) = \sigma\varphi(b^ka^m) = \sigma(b^ka^mK) = \sigma(bK)^k \cdot \sigma(aK)^m = (c^3)^k 1^m = c^{3k}. \quad \square$$
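
The count in Example 9 can be double-checked by exhaustive search. This is a sketch under two modelling assumptions that are not in the text: $D_3$ is realized by permutations of {0, 1, 2} with $a : i \mapsto i + 1$ and $b : i \mapsto -i$ (mod 3), and $C_6$ is written additively so that $c^j$ is stored as $j$ modulo 6. A candidate map is determined by $x = \alpha(a)$ and $y = \alpha(b)$ through $\alpha(b^ka^m) = ky + mx$.

```python
def compose(p, q):
    """Return p∘q (apply q first, then p) for permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def power(p, n):
    result = tuple(range(len(p)))
    for _ in range(n):
        result = compose(result, p)
    return result

a = (1, 2, 0)   # i -> i + 1 (mod 3), order 3
b = (0, 2, 1)   # i -> -i (mod 3), order 2, and aba = b

# Each element of D_3 written uniquely as b^k a^m, stored as permutation -> (k, m).
word = {compose(power(b, k), power(a, m)): (k, m)
        for k in range(2) for m in range(3)}

def homomorphisms():
    """All pairs (x, y) for which b^k a^m -> k*y + m*x (mod 6) is a homomorphism."""
    found = []
    for x in range(6):
        for y in range(6):
            f = {p: (k * y + m * x) % 6 for p, (k, m) in word.items()}
            if all(f[compose(p, q)] == (f[p] + f[q]) % 6 for p in word for q in word):
                found.append((x, y))
    return found

print(homomorphisms())
```

Only two pairs survive: $(0, 0)$, the trivial homomorphism, and $(0, 3)$, which is the map $\alpha(b^ka^m) = c^{3k}$ of the example.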


We conclude this section with one more result using the isomorphism theorem. 
Recall (Example 18 §2.5) that the set inn G of all inner automorphisms of a group 
G is a subgroup of the group aut G of all automorphisms of G. 


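Theorem 5. If $G$ is any group, then $G/Z(G) \cong \operatorname{inn} G$.

Proof. If $a \in G$, recall that the inner automorphism $\sigma_a : G \to G$ is defined by
$\sigma_a(g) = aga^{-1}$ for all $g \in G$. Then $\sigma_a\sigma_b = \sigma_{ab}$ for all $a, b \in G$ (Example 17 §2.5), and
so $\theta(a) = \sigma_a$ defines a group homomorphism $\theta : G \to \operatorname{aut} G$. Clearly, $\theta(G) = \operatorname{inn} G$,
and

$$\ker \theta = \{a \in G \mid \sigma_a = 1_G\} = \{a \in G \mid aga^{-1} = g \text{ for all } g \in G\} = Z(G).$$

The result now follows from the isomorphism theorem. ■
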
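Example 10. Show that $\operatorname{inn} S_3 \cong S_3$. Show further that $\operatorname{inn} S_3 = \operatorname{aut} S_3$.

Solution. $Z(S_3) = \{\varepsilon\}$ is easily verified, so $S_3 \cong \operatorname{inn} S_3$ by Theorem 5. Hence
$|\operatorname{inn} S_3| = 6$ so, since $\operatorname{inn} S_3 \subseteq \operatorname{aut} S_3$, it suffices to show that $|\operatorname{aut} S_3| \leq 6$. But
$S_3 = \{1, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$ where $o(\sigma) = 3$ and $o(\tau) = 2$. So if $\theta : S_3 \to S_3$ is an
automorphism then $o(\theta(\sigma)) = o(\sigma) = 3$, so $\theta(\sigma) = \sigma$ or $\sigma^2$. Similarly $\theta(\tau) = \tau$, $\tau\sigma$
or $\tau\sigma^2$, so there are at most $2 \cdot 3 = 6$ choices for $\theta$ because $S_3 = \langle \sigma, \tau \rangle$. □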
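
For a group as small as $S_3$ the equality $\operatorname{inn} S_3 = \operatorname{aut} S_3$ can also be confirmed by listing every bijection $S_3 \to S_3$ and keeping those that preserve products. A minimal sketch, assuming permutations of {0, 1, 2} as tuples; the variable names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p) for permutations given as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)

S3 = list(permutations(range(3)))

# Automorphisms: bijections f with f(pq) = f(p) f(q); each is recorded
# as the tuple of its values on S3 in a fixed order.
automorphisms = set()
for images in permutations(S3):          # all 720 bijections S3 -> S3
    f = dict(zip(S3, images))
    if all(f[compose(p, q)] == compose(f[p], f[q]) for p in S3 for q in S3):
        automorphisms.add(images)

# Inner automorphisms x -> g x g^{-1}, recorded the same way.
inner = {tuple(compose(compose(g, x), inverse(g)) for x in S3) for g in S3}

print(len(automorphisms), len(inner), automorphisms == inner)
```

The search finds six automorphisms and six distinct inner automorphisms, and the two sets coincide.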


Exercises 2.10 


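1. Let $G = \left\{ \begin{bmatrix} a & b \\ 0 & c \end{bmatrix} \,\middle|\, a, b, c \in \mathbb{R},\ a \neq 0,\ c \neq 0 \right\}$ and $K = \left\{ \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} \,\middle|\, b \in \mathbb{R} \right\}$, subgroups
of $GL_2(\mathbb{R})$. Show that $K \triangleleft G$ by exhibiting $K$ as the kernel of a group
homomorphism $G \to \mathbb{R}^* \times \mathbb{R}^*$.
2. Show that the following are equivalent for a group homomorphism $\alpha : G \to G_1$.
(a) $\alpha$ is trivial (b) $\ker \alpha = G$ (c) $\alpha(G) = \{1\}$
3. Let $H$ be a subgroup of $G$ with $|G : H| = 2$, and define $\alpha : G \to \{1, -1\}$ by
$\alpha(a) = \begin{cases} 1, & \text{if } a \in H \\ -1, & \text{if } a \notin H \end{cases}$. Show that $\alpha$ is a homomorphism and that $\ker \alpha = H$.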


4. If $\alpha : G \to G_1$ is a group homomorphism and if $X$ is a subgroup of $\alpha(G)$, the
preimage of $X$ under $\alpha$ is defined by $\alpha^{-1}(X) = \{g \in G \mid \alpha(g) \in X\}$. For example
$\alpha^{-1}(\{1\}) = \ker \alpha$. [Note: The notation $\alpha^{-1}$ here is not intended to imply that $\alpha$ is an
isomorphism.]
(a) Show that $\alpha^{-1}(X)$ is a subgroup of $G$, normal if $X \triangleleft \alpha(G)$.
(b) Show that $X \subseteq Y$ if and only if $\alpha^{-1}(X) \subseteq \alpha^{-1}(Y)$.
(c) Show that $\alpha^{-1}(X \cap Y) = \alpha^{-1}(X) \cap \alpha^{-1}(Y)$.
5. Let $p_m : G \to G$ be the $m$-power map: $p_m(g) = g^m$. Assume $G$ is abelian and $|G| = n$.
(a) Show that $\ker p_m = \{g \mid g^d = 1\}$ where $d = \gcd(m, n)$.
(b) If $m$ and $n$ are relatively prime, show that $p_m$ is an automorphism.
(c) If $G = \langle a \rangle$ is cyclic, show that every automorphism of $G$ arises as in (b).


6. Let $\alpha : G \to G_1$ be a group homomorphism with $\ker \alpha = K$. For $a \in G$, show that
$Ka = \{g \in G \mid \alpha(g) = \alpha(a)\}$.
7. If $\alpha : G \to G_1$ is a group homomorphism and both $\alpha(G)$ and $\ker \alpha$ are finitely
generated, show that $G$ is finitely generated.


. Find all group homomorphisms 


(a) Cs — Ks (b) C3 —. Ag (c) D3 _ C4 (d) Ag _ C3 


. If K=f{e, (1 2)(3 4),  3)(2 4), (1 4)(2 3)}, is there a group homomorphism 


a: S4— Ag, with kera = K? Support your answer. 


. Determine if there exists an onto group homomorphism in each case: 


(a)a:S33 K, (b)a:53-2C3 (c)a:53>C. (d)a: RC, 


11. If $G$ is a group, let $\theta : G \to G \times G$ be defined by $\theta(g) = (g, g)$.
(a) Show that $\theta$ is a one-to-one group homomorphism.
(b) Show that the following conditions are equivalent: (1) $G$ is abelian; (2) $\theta(G)$ is
normal in $G \times G$; and (3) $\theta(G) = \ker \varphi$ for some homomorphism $\varphi : G \times G \to G$.
12. Show that a group $G$ is simple if and only if every nontrivial group homomorphism
$G \to G_1$ is one-to-one.
13. If $G$ is a simple group, show that there is a nontrivial group homomorphism $G \to G_1$
if and only if $G_1$ has a subgroup isomorphic to $G$.

If n is odd, show that there are at most 36 group homomorphisms D, — Aq. 

15. If $|G| > 2$ and $\operatorname{aut} G$ is cyclic, show that $G$ is abelian and that $\operatorname{aut} G$ is finite and of
even order. [Hint: Theorem 2 §2.9 and Theorem 5.]


16. If $\operatorname{aut} G$ is simple, show that $G$ is abelian or $G/Z(G)$ is simple. [Hint: Exercise 16
§2.8.]


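17. Let $\alpha : G \to G_1$ be a group homomorphism.
(a) Show that $\alpha(G') \subseteq G_1'$. [Hint: Show $\alpha([a, b]) = [\alpha(a), \alpha(b)]$ for all $a, b \in G$.]
(b) If $\varphi : G \to G/G'$ and $\varphi_1 : G_1 \to G_1/G_1'$ are the coset maps, show that a unique
homomorphism $\bar{\alpha} : G/G' \to G_1/G_1'$ exists such that $\bar{\alpha}\varphi = \varphi_1\alpha$.
[Diagram: a square with $\alpha : G \to G_1$ on top, $\varphi$ and $\varphi_1$ as the vertical maps, and $\bar{\alpha} : G/G' \to G_1/G_1'$ on the bottom.]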

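18. If $G = H \times K$ and $K_1 = \{(1, k) \mid k \in K\}$, show that $K_1 \triangleleft G$, $K_1 \cong K$ and $G/K_1 \cong H$.
19. Let $G = GL_n(\mathbb{R})$, let $K = \{A \mid \det A = 1\}$ and let $K_1 = \{A \mid \det A = \pm 1\}$.
(a) Show $K \triangleleft G$ and $G/K \cong \mathbb{R}^*$. (b) Show $K_1 \triangleleft G$ and $G/K_1 \cong \mathbb{R}^+$.
20. Let $G = \left\{ \begin{bmatrix} a & b \\ 0 & c \end{bmatrix} \,\middle|\, a, b, c \in \mathbb{R},\ a \neq 0,\ c \neq 0 \right\}$. If $K = \left\{ \begin{bmatrix} 1 & b \\ 0 & 1 \end{bmatrix} \,\middle|\, b \in \mathbb{R} \right\}$, show that
$K \triangleleft G$ and $G/K \cong \mathbb{R}^* \times \mathbb{R}^*$.
21. Show that $\mathbb{C}^*/\mathbb{C}^0 \cong \mathbb{R}^+$, where $\mathbb{C}^0 = \{z \mid |z| = 1\}$ is the circle group.
22. Show that $\mathbb{R}^*/\{1, -1\} \cong \mathbb{R}^+$.
23. If $a, b \in \mathbb{R}$, define $\tau_{a,b} : \mathbb{R} \to \mathbb{R}$ by $\tau_{a,b}(x) = ax + b$ for all $x \in \mathbb{R}$. It follows that
$G = \{\tau_{a,b} \mid a, b \in \mathbb{R},\ a \neq 0\}$ is a subgroup of $S_{\mathbb{R}}$. Show that $K = \{\tau_{1,b} \mid b \in \mathbb{R}\}$ is a
normal subgroup of $G$ and $G/K \cong \mathbb{R}^*$.
24. Consider $M_2(\mathbb{Z})$ as a group under addition. For $n \geq 2$, show that $M_2(n\mathbb{Z}) \triangleleft M_2(\mathbb{Z})$
and $M_2(\mathbb{Z})/M_2(n\mathbb{Z}) \cong M_2(\mathbb{Z}_n)$, all additive groups.
25. If $G$ is abelian, let $K = \{(g, g, g) \mid g \in G\}$. Show that $K \triangleleft G \times G \times G$ and
$(G \times G \times G)/K \cong G \times G$.
26. If $G/K \cong H$, show that there is an onto homomorphism $\alpha : G \to H$ with $\ker \alpha = K$.
27. If $\alpha : G \to G_1$ is a group homomorphism and $K \triangleleft G$ with $\ker \alpha \subseteq K$, show that
$\alpha(K) \triangleleft \alpha(G)$ and $\alpha(G)/\alpha(K) \cong G/K$.
28. Let $G$ be a finite abelian group. Show that the following conditions are equivalent
for an integer $m$: (1) $g^m = 1$ in $G$ implies that $g = 1$; and (2) every element $g \in G$
has an $m$th root, that is, $g = a^m$ for some $a \in G$. Compare your results with those of
Exercise 16 §2.6.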
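29. Let $G = \left\{ \begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\}$.
(a) Show that $G$ is a subgroup of $M_3(\mathbb{R})^*$ and that $Z(G) \cong \mathbb{R}$.
(b) Show that $G/Z(G) \cong \mathbb{R} \times \mathbb{R}$.
30. Use the isomorphism theorem to show that, if $m \mid n$, then $\mathbb{Z}_n/\langle \bar{m} \rangle \cong \mathbb{Z}_m$.
31. Let $s = \operatorname{lcm}(m, n)$. Show that $\mathbb{Z}_s$ is isomorphic to a subgroup of $\mathbb{Z}_m \times \mathbb{Z}_n$. [Hint:
Think of $\mathbb{Z}_m$ as $\mathbb{Z}_m = \mathbb{Z}/m\mathbb{Z}$.]
32. Show that every infinite homomorphic image of $\mathbb{Z}$ is isomorphic to $\mathbb{Z}$.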

Describe the homomorphic images of each group. 

(a) Zs (b) Asimple group G (c) Ag [Hint: Exercise 3 §2.8.] 

34. If $|G| > 3$, show that $G$ has at least two automorphisms. [Hint: Theorem 5.]
35. Let $\alpha : G \to G_1$ be an onto group homomorphism. If $X$ is a subgroup of $G_1$, define
$\alpha^{-1}(X) = \{g \in G \mid \alpha(g) \in X\}$ as in Exercise 4. If $X \triangleleft G_1$ show that $\alpha^{-1}(X) \triangleleft G$ and
$G/\alpha^{-1}(X) \cong G_1/X$.


36. If $X$ and $Y$ are additive abelian groups, let $\hom(X, Y)$ denote the set of all
group homomorphisms $\alpha : X \to Y$. If $\alpha, \beta \in \hom(X, Y)$, define $\alpha + \beta : X \to Y$ by
$(\alpha + \beta)(x) = \alpha(x) + \beta(x)$ for all $x \in X$.
(a) Show that $\hom(X, Y)$ is an abelian group under this addition.
(b) Show that $Y \cong \hom(\mathbb{Z}, Y)$ for every additive abelian group $Y$.
(c) Show that hom(Z,,,Z,) = Za, where d = gcd(m, n). [Hint: If e = n/d, define ay : 
Zim —> Ly, bY Oy (%) = ke, where =c#+MZEZp,andi=zr+nZeE Zp. 
37. If G is a group and g,; € G for all i > 0, let [9:) = (90,91, 92,-*: ) denote an infinite 
sequence from G. Define [g;) = [hi) if and only if g; =A, for all i>0 and define 
[9:) - [hi) = [gihi). Write GY = {[g:) | 9: € Gh. 
(a) Show that GY is a group with the preceding multiplication. 
(b) Show that Go = {[9:) | go € G, 9; = 1 for all ¢ > 1} is a normal subgroup of G”, 
and GY’ /Go = G”. 
(c) Let F denote the set of mappings N > G and, if f,g € F, define fg € F by fg(i) = 
f(t) + g(i) for all 1 € N. Show that F is a group. What is the relationship between F' 
and G’? Support your answer. 
38. If $K \triangleleft G$ show that $C(K) \triangleleft G$ and $G/[C(K)]$ is isomorphic to a subgroup of $\operatorname{aut} K$,
where $C(K) = \{a \in G \mid ak = ka \text{ for all } k \in K\}$. [Hint: Theorem 5.]


2.11 AN APPLICATION TO BINARY LINEAR CODES 


The value of mathematics in any science lies more in disciplined analysis and abstract 
thinking than in particular theories or techniques. 


——Alan Tucker 


Coding theory is concerned with the transmission of information over a channel that 
is affected by noise. The noise causes errors, and the general aim is to detect such 
errors when they occur and to correct them if possible. Such codes are used every 
day in communication systems such as radio, television, and telephone; in data 
storage systems such as those used by banks; in the internal circuits of computers; 
and in many other systems where information is being processed. With the advent
of computers, information is often expressed in digital form, that is as strings of 0s
and 1s which computers can easily handle. Consequently we deal with binary codes
that are based on $\mathbb{Z}_2 = \{0, 1\}$.

General coding theory originated in the 1940s, primarily with the work of Claude 
Shannon. He created a mathematical theory of information and proved that certain 
codes exist which can transmit information at near optimal rates with arbitrarily 
small chance of error. In 1950, Richard W. Hamming discovered the error-detecting 
and error-correcting codes that now bear his name. Many of these codes are widely 
used today. 

Example 1 concretely illustrates many of the features of general coding. 


Example 1. Suppose that a spacecraft is orbiting the moon, and assume that the
message 1 or 0 is to be sent instructing the mission commander to land or not.
Because of static interference (noise) the probability$^{36}$ is 0.1 that an error will
occur during transmission (and hence a probability of 0.9 that no error will occur).
To ensure accuracy, the earth station transmits five signals: 11111 instead of 1 and
00000 for 0. The spacecraft computer receives a five-digit message and decodes it by
a simple majority: It concludes that 11111 was sent if more 1s than 0s are received
and that 00000 was sent otherwise. For example, if it receives 11001 it concludes
that 11111 was sent. Thus, the spacecraft computer will get the wrong message if
and only if three or more errors occur in transmission and (assuming successive
errors occur independently) the probability of this happening$^{37}$ is 0.00856. This
probability is less than 1%, even though there is a 10% chance of error on any one
transmission. This decision method is called maximum likelihood decoding. □
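
The figure 0.00856 is a binomial tail probability and is easy to recompute. A minimal sketch, assuming independent bit errors as in the example; `majority_error_probability` and `majority_decode` are illustrative names, not terminology from the text.

```python
from math import comb

def majority_error_probability(n, p):
    """Probability that more than half of n independent bits are flipped,
    when each bit is flipped with probability p (n odd)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

def majority_decode(word):
    """Decode a received repetition-code word by simple majority."""
    ones = word.count("1")
    return "1" if ones > len(word) - ones else "0"

print(round(majority_error_probability(5, 0.1), 5))
print(majority_decode("11001"))
```

With five repetitions and error probability 0.1, the sum runs over 3, 4 and 5 errors, which are exactly the cases in which the majority vote is wrong.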


Example 1 is a good illustration of the way coding works. A sender has a message 
to send (say, 1 in Example 1). It is encoded (as 11111) and transmitted over a noisy 
channel where it is received (as, say, 11001) and decoded (as 1) before being sent to 
the receiver. In Example 1, the coding process can detect errors and correct them 
with a probability of less than 0.01 of being wrong. 

In general, it is desirable to have more messages than 1 or 0 available for encod- 
ing and transmission. For convenience (and due to the ubiquity of computers) we 
assume that our messages, and the encoded messages to be transmitted, are strings 
of 0s and 1s. We use the following notation. If $n \geq 1$, let

$$B^n = \mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2 \times \cdots \times \mathbb{Z}_2$$

denote the direct product of $n$ copies of the (additive abelian) group $\mathbb{Z}_2 = \{0, 1\}$.
The elements of $B^n$ are called words of length $n$ and, for convenience, we write
them as strings of 0s and 1s rather than as $n$-tuples. Thus, 110101 in $B^6$ stands for
$(1, 1, 0, 1, 0, 1)$. We call the individual 0s and 1s the bits of the word (an abbreviation
for binary digits). A subset $C$ of $B^n$, with $|C| \geq 2$, is called an $n$-binary code (or
simply an $n$-code). The words in $C$ are called code words.

We describe the general coding process in the diagram.

[Diagram: message word → (encoding) → code word → (transmission) → received word → (decoding) → code word → (retrieval) → message word.]


A set of words, called message words, is given in $B^k$. They are paired with a
set $C$ of longer words in $B^n$, $n > k$, which will actually be transmitted. Thus $C$ is


36. We treat probability informally here. The probability that an event occurs is the long-term
proportion of the time that the event does indeed occur. Thus probabilities are numbers between 
0 and 1. A probability of 0 means that the event in question is impossible; a probability of 1 means 
that the event is certain to occur; and a probability of 0.5 means that the event is as likely as not 
to occur. 

37. The probability is computed as $\binom{5}{3}(0.1)^3(0.9)^2 + \binom{5}{4}(0.1)^4(0.9)^1 + \binom{5}{5}(0.1)^5 = 0.00856$. It is
based on the assumption that at most one error occurs in each digit transmitted and that these 
errors occur independently. 




an n-code, and the process of passing from a message to the corresponding code 
word is called encoding. Only code words are transmitted but, as some bits may 
be altered during transmission, words other than code words may be received. The 
sole purpose of the encoding process is to enable the receiver to detect errors and, if 
there are not too many, to correct them. The encoding and transmission processes 
are usually quite simple. The message words in $B^k$ are paired with code words in
$B^n$ in such a way that passing back and forth is easy. A common method is to add
extra bits (called check bits) to the end of the message so that the message itself 
forms the first k bits of the code word (making retrieval easy). The transmission 
process is more complex, and the design of codes that are easy and inexpensive 
to transmit (using, say, shift registers) is an important problem that we do not 
consider here. The most mathematically interesting part of the process is decoding. 
A method must be devised to detect bit errors in the received word and, hopefully, 
to correct them and so reconstruct the transmitted code word. The transmission and 
decoding part of the process begins and ends with code words, so we concentrate 
on constructing codes and pay less attention to encoding and retrieving. 

In Example 1, the 5-code {00000, 11111} has so few code words that a system 
(majority rule) of decoding can correct errors with a small probability of error. 
However, sometimes (for example, when retransmission is easy and inexpensive) all 
that is needed is to detect errors. Example 2 gives one such system that is commonly 
used. 


Example 2. Parity-check Codes are $n$-codes that are constructed as follows.
The message words are the elements of $B^{n-1}$, and we form the code words by
adding one extra bit at the end, selecting it so that the total number of 1s is even
(equivalently, the sum of the bits (in $\mathbb{Z}_2$) is 0). Such words are said to have even
parity. Thus, the 4-parity-check code $C$ is


Message words ($B^3$):  000   001   010   011   100   101   110   111
Code words ($C$):       0000  0011  0101  0110  1001  1010  1100  1111


If a member of $C$ is transmitted and one error occurs, the received word will have
an odd number of 1s (odd parity) and so the error is detected. This code can thus
detect any odd number of errors, but it cannot detect an even number of errors and
it cannot correct any errors. Nonetheless, it is used in banking (the last digit of an
account number is often a control digit) and in the internal arithmetic of digital
computers. □
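The construction in Example 2 can be sketched in a few lines of Python. This is only an illustration under the assumption that words are stored as strings of 0s and 1s; the function names `parity_encode` and `has_even_parity` are ours and do not come from the text.

```python
def parity_encode(message: str) -> str:
    """Append one check bit so that the code word has an even number of 1s."""
    check_bit = message.count("1") % 2
    return message + str(check_bit)


def has_even_parity(word: str) -> bool:
    """A received word passes the check exactly when it has even parity."""
    return word.count("1") % 2 == 0


# The 4-parity-check code: encode every message word in B^3.
code = [parity_encode(format(m, "03b")) for m in range(8)]

# One flipped bit is detected; two flipped bits go unnoticed.
one_error = "0111"    # code word 0110 with its last bit changed
two_errors = "0000"   # code word 0110 with its two middle bits changed
print(code, has_even_parity(one_error), has_even_parity(two_errors))
```

The list `code` reproduces the table of code words above, and the last line illustrates why the code detects an odd number of errors but not an even number.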


Nearest Neighbor Decoding 


Many important error-correcting codes operate in the following way. A method is
found to define the distance between two words in $B^n$. Then a code $C \subseteq B^n$ is
found whose members are so far apart that, if any one bit (say) in a code word $c$
is changed, the new word $w$ is still closer to $c$ than to any other word in the code.
Thus, if $c$ is transmitted and one error occurs, the received word $w$ can be corrected


146 2. Groups 


by replacing it with the code word closest to it. We state this more compactly as
follows.


Nearest Neighbor Decoding Let $C$ be an $n$-code. If a word $w$ is received, it is
decoded as the code word in $C$ closest to it. (If more than one candidate appears,
choose arbitrarily.³⁸)


Codes can be constructed that will correct any finite number of errors using nearest 
neighbor decoding. 

Of course the whole thing depends on the existence of an appropriate distance
function on $B^n$. If a word $c$ is transmitted and $t$ errors occur, the received word $w$
will differ from $c$ in exactly $t$ bits. This is the distance between $c$ and $w$.

More precisely, let $v$ and $w$ be words in $B^n$. The Hamming distance³⁹ $d(v, w)$
between $v$ and $w$ is the number of coordinates at which their corresponding bits
differ. Thus, if $v = v_1 v_2 \cdots v_n$ and $w = w_1 w_2 \cdots w_n$, where the $v_i$ and $w_i$ are the
bits, then $d(v, w)$ is the number of indices $i$ such that $v_i \neq w_i$. Define the Hamming
weight of $w$ by $\operatorname{wt} w = d(w, 0)$. Thus, $\operatorname{wt} w$ is the number of 1s occurring as bits of
the word $w$.

The following theorem gives some fundamental properties of the Hamming
weight and distance functions. The proof uses the fact that $B^n$ is an additive group
under componentwise operations. Thus two words are added by adding corresponding
bits modulo 2. For example,


$$10101 + 11011 = 01110 \quad \text{in } B^5.$$


Note that the unity is the word $000 \cdots 0$, each of whose bits is 0, which we denote
0. Also, $-w = w$ for each word $w$ in $B^n$, but we write $v - w$ for clarity.


Theorem 1. Let $u$, $v$, and $w$ be words in $B^n$.
(1) $d(v, w) = \operatorname{wt}(v - w)$.
(2) $d(v, w) = d(w, v)$.
(3) $d(v, w) = 0$ if and only if $v = w$.
(4) $d(u, w) \leq d(u, v) + d(v, w)$.


Proof. (1) A bit of $v - w$ is a 1 if and only if $v$ and $w$ differ at that coordinate.
Hence the number of bits of $v - w$ that are 1s equals the number of coordinates
where $v$ and $w$ differ. This is (1).

(2), (3). We leave the proofs to the reader.

(4) Write $x = u - v$ and $y = v - w$, so that $x + y = u - w$. Then, using (1),
condition (4) becomes $\operatorname{wt}(x + y) \leq \operatorname{wt} x + \operatorname{wt} y$. Now let $x_i$ and $y_i$ denote the
$i$th bits of $x$ and $y$, respectively. Then $\operatorname{wt}(x + y)$ is the number of values
of $i$ for which $x_i + y_i = 1$. Hence (4) certainly holds if $x_i + y_i = 1$ implies that
$x_i = 1$ or $y_i = 1$. But this implication is clear because $x_i = 0 = y_i$ implies that
$x_i + y_i = 0$. ∎
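As an informal check of these definitions, the following Python sketch computes sums, weights, and distances for words written as strings of 0s and 1s, and tests the triangle inequality exhaustively on $B^3$. The representation and the helper names (`add_words`, `hamming_weight`, `hamming_distance`) are assumptions made for illustration only.

```python
from itertools import product


def add_words(v: str, w: str) -> str:
    """Add two words of equal length bit by bit, modulo 2."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(v, w))


def hamming_weight(w: str) -> int:
    """Number of 1s occurring as bits of w."""
    return w.count("1")


def hamming_distance(v: str, w: str) -> int:
    """Number of coordinates at which v and w differ."""
    if len(v) != len(w):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(v, w))


# The sum computed in the text, and property (1) of Theorem 1 for this pair.
s = add_words("10101", "11011")
print(s, hamming_distance("10101", "11011"), hamming_weight(s))

# Property (4), the triangle inequality, checked for every triple in B^3.
words = ["".join(bits) for bits in product("01", repeat=3)]
triangle_ok = all(
    hamming_distance(u, w) <= hamming_distance(u, v) + hamming_distance(v, w)
    for u in words for v in words for w in words
)
print(triangle_ok)
```

Because $-w = w$ in $B^n$, the same function `add_words` serves for both $v + w$ and $v - w$.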


Properties (2), (3), and (4) of Theorem 1 justify calling $d$ a distance function
on $B^n$. The first two are clearly true of ordinary distance. With respect to


38 If it is feasible, retransmission may be called for in this case.
39 The name honors Richard W. Hamming. Distance functions are also called metrics.




property (4), we may regard $u$, $v$, and $w$ as
the vertices of a triangle (see the figure).
Then (4) asserts that the length of one side
of a triangle is not greater than the sum of
the lengths of the other two sides. For this
reason we call (4) the triangle inequality.

[Figure: a triangle with vertices $u$, $v$, and $w$]

This geometric terminology for Hamming distance is useful for discussing nearest
neighbor decoding. If $w$ is a word in $B^n$ and $r \geq 0$ is a real number, the set


$$S_r(w) = \{v \in B^n \mid d(v, w) \leq r\}$$


is called the ball of radius $r$ about $w$ (or simply the $r$-ball about $w$). We use this
to describe how to construct a code $C$ that can detect (or correct) $t$ errors.

Suppose that a code word $c$ is transmitted and a word $w$ is received with $s$
errors, where $1 \leq s \leq t$. Then $s$ is the number of coordinates at which the digits
of $c$ and $w$ differ; that is, $s = d(c, w)$. Hence $S_t(c)$ consists of all possible received
words where at most $t$ errors have occurred. We first assume that $C$ has the property
that no code word lies in the $t$-ball of another code word. Because $w \in S_t(c)$ and
$w \neq c$, this means that $w$ is not a code word and that the error has been detected.
If we strengthen the assumption on $C$ to require that the $t$-balls about code words
are pairwise disjoint, then $w$ belongs to a unique ball (that about $c$), so $w$ will be
correctly decoded as $c$.

To describe when this happens, let $C$ be an $n$-code. The minimum distance
$d$ of $C$ is defined to be the smallest distance between two distinct code words in $C$.
That is,

$$d = \min\{d(v, w) \mid v, w \in C,\ v \neq w\}.$$


Theorem 2. Let $C$ be an $n$-code with minimum distance $d$. Assume that nearest
neighbor decoding is used.

(1) If $t + 1 \leq d$, then $C$ can detect⁴⁰ $t$ errors.

(2) If $2t + 1 \leq d$, then $C$ can correct $t$ errors.

Proof. (1) If $c \in C$, the $t$-ball $S_t(c)$ contains no other code word because $t < d$.
Hence $C$ can detect $t$ errors by the preceding discussion.

(2) If $2t + 1 \leq d$, it suffices (by the preceding discussion) to show that the
$t$-balls about distinct code words are pairwise disjoint. But if $c \neq c'$ in $C$ and
$w \in S_t(c') \cap S_t(c)$, then the triangle inequality gives


$$d(c, c') \leq d(c, w) + d(w, c') \leq t + t = 2t < d$$


by the hypothesis, a contradiction. ∎


Example 3. The following 7-code has minimum distance 3, so it can detect 2 errors
and correct 1 error.


$$\{0000000, 0101010, 1010101, 1110000, 1011010, 0100101, 0001111, 1111111\}.$$ □
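The claim in Example 3 can be checked by brute force. The sketch below computes the minimum distance of the code by comparing all pairs of code words, then applies nearest neighbor decoding to one received word. Words are stored as strings, and the helper names are our own illustrative choices.

```python
from itertools import combinations


def hamming_distance(v: str, w: str) -> int:
    """Number of coordinates at which v and w differ."""
    return sum(a != b for a, b in zip(v, w))


def minimum_distance(code: list[str]) -> int:
    """Smallest distance between two distinct code words."""
    return min(hamming_distance(v, w) for v, w in combinations(code, 2))


def nearest_neighbors(word: str, code: list[str]) -> list[str]:
    """All code words at minimal distance from the received word."""
    best = min(hamming_distance(word, c) for c in code)
    return [c for c in code if hamming_distance(word, c) == best]


C = ["0000000", "0101010", "1010101", "1110000",
     "1011010", "0100101", "0001111", "1111111"]

d = minimum_distance(C)
detects = d - 1           # largest t with t + 1 <= d
corrects = (d - 1) // 2   # largest t with 2t + 1 <= d
print(d, detects, corrects)

# The code word 0101010 with its last bit changed.
print(nearest_neighbors("0101011", C))
```

When `nearest_neighbors` returns more than one word, the decoder must choose among them arbitrarily (or ask for retransmission), as noted in the statement of nearest neighbor decoding.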


40 If $C$ can detect (correct) $t$ or fewer errors, we say simply that $C$ detects (corrects) $t$ errors.




If $c$ is any word in $B^n$, a word $w$ satisfies $d(w, c) = r$ if and only if $w$ and $c$
differ in exactly $r$ bits. Hence there are exactly $\binom{n}{r}$ such words $w$ (where $\binom{n}{r}$ is the
binomial coefficient), because there are $\binom{n}{r}$ ways to choose $r$ bits of $c$ to change.
Therefore


$$|S_t(c)| = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{t}.$$

This leads to a useful bound on the size of error-correcting codes.


Theorem 3. Hamming Bound. If an $n$-code $C$ can correct $t$ errors, then


$$|C| \leq \frac{2^n}{\binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{t}}.$$

Proof. Write $N = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{t}$. The $t$-balls centered at distinct code words
each contain $N$ words, and there are $|C|$ of them. Hence they contain $N|C|$ distinct
words (being pairwise disjoint). Hence $N|C| \leq 2^n$ because $|B^n| = 2^n$. This proves
the theorem. ∎


An $n$-code $C$ is called perfect if there is equality in Theorem 3 or, equivalently,
if every word in $B^n$ lies in exactly one $t$-ball about a code word. Such codes exist.
For example, if $n = 3$ and $t = 1$, then $\binom{3}{0} + \binom{3}{1} = 4$ and the Hamming bound is
$2^3/4 = 2$. The 3-code $C = \{000, 111\}$ has minimum distance 3, so by Theorem 2 it
can correct 1 error. Hence $C$ is perfect. We present another example of a perfect
code later.
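The arithmetic behind the Hamming bound is easy to automate. The following sketch uses the standard library function `math.comb`; the names `ball_size`, `hamming_bound`, and `is_perfect` are hypothetical helpers introduced here for illustration.

```python
from math import comb


def ball_size(n: int, t: int) -> int:
    """Number of words in a t-ball of B^n: C(n,0) + C(n,1) + ... + C(n,t)."""
    return sum(comb(n, r) for r in range(t + 1))


def hamming_bound(n: int, t: int) -> int:
    """Largest integer size allowed by Theorem 3 for a t-error-correcting n-code."""
    return 2 ** n // ball_size(n, t)


def is_perfect(n: int, size: int, t: int) -> bool:
    """Equality in Theorem 3: the t-balls about code words exactly fill B^n."""
    return size * ball_size(n, t) == 2 ** n


# The example in the text: n = 3, t = 1, C = {000, 111}.
print(ball_size(3, 1), hamming_bound(3, 1), is_perfect(3, 2, 1))
```

Note that the bound is only a necessary condition: a code of the permitted size that corrects $t$ errors need not exist for every $n$ and $t$.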


Binary Linear Codes and Coset Decoding 


Up to this point we have regarded any nonempty subset of $B^n$ as an $n$-code. However,
many important codes are subgroups. The group $B^n$ has order $2^n$ so, by
Lagrange's theorem, each subgroup has order $2^k$ for some $k = 0, 1, \ldots, n$. Given
integers $k$ and $n$, with $1 \leq k \leq n$, an additive subgroup $C$ of $B^n$ of order $2^k$ is
called an $(n, k)$-binary linear code (or simply an $(n, k)$-code). Note that we do
not regard the trivial subgroup ($k = 0$) as a code.


Example 4. The code {00000, 11111} in Example 1 is a (5, 1)-code. 


Example 5. The n-parity-check codes in Example 2 are (n,n — 1)-codes, because 
the sum of two words of even parity also has even parity. 


Example 6. {0000, 0101, 1010, 1111} is a (4, 2)-code. The following is a (4, 3)-code: 
{0000, 0010, 0101, 0111, 1000, 1010, 1101, 1111}. 


Many of the properties of the general n-codes take a simpler form for linear 
codes. The first part of the next theorem gives a much easier way to find the 
minimum distance of a linear code, the second and third parts strengthen Theorem 
2, and the fourth part reformulates the Hamming bound. 


Theorem 4. Let $C$ be an $(n, k)$-code with minimum distance $d$.
(1) $d = \min\{\operatorname{wt} w \mid 0 \neq w \in C\}$.⁴¹
(2) $C$ can detect $t$ errors if and only if $t + 1 \leq d$.
(3) $C$ can correct $t$ errors if and only if $2t + 1 \leq d$.
(4) If $C$ can correct $t$ errors, then $\binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{t} \leq 2^{n-k}$.


41 Because of this the minimum distance of a linear code is sometimes called the minimum weight
of the code.


Proof. (1) Write $d' = \min\{\operatorname{wt} w \mid 0 \neq w \in C\}$. If $0 \neq w \in C$, then $\operatorname{wt} w = d(w, 0) \geq d$ by the
definition of $d$. Hence $d' \geq d$. However, Theorem 1 gives $d(v, w) = \operatorname{wt}(v - w)$ for all
$v \neq w$ in $C$, so $d(v, w) \geq d'$ because $v - w \in C$ ($C$ is a group). Hence $d \geq d'$.

(2) Assume that $C$ can detect $t$ errors. If $w \in C$, $w \neq 0$, the $t$-ball about $w$
contains no other code word (see the discussion preceding Theorem 2). In particular,
it does not contain the code word 0, so $t + 1 \leq d(w, 0) = \operatorname{wt} w$. Hence $t + 1 \leq d$ by
(1). The converse is part of Theorem 2.

(3) If $C$ corrects $t$ errors, the $t$-balls about code words are pairwise disjoint (see
the discussion preceding Theorem 2). It suffices to show that $\operatorname{wt} c \geq 2t + 1$ for all
$c \in C$, $c \neq 0$, since then $d \geq 2t + 1$ by (1).

So assume, on the contrary, that $\operatorname{wt} c \leq 2t$. We show that then $S_t(0) \cap S_t(c) \neq \emptyset$,
a contradiction. Since $c \notin S_t(0)$, we have $\operatorname{wt}(c) > t$, so $c$ has more than $t$ ones as
bits. Form $w$ by changing exactly $t$ of these ones to zeros, and leaving the other bits
of $c$ as they were. Then $d(w, c) = t$, so $w \in S_t(c)$. But $c$ has at most $2t$ ones as digits
($\operatorname{wt}(c) \leq 2t$), so $w$ will have at most $t$ ones. Hence $\operatorname{wt}(w) \leq t$; that is, $d(w, 0) \leq t$;
that is, $w \in S_t(0)$. So $S_t(0) \cap S_t(c) \neq \emptyset$, as required. The converse is part of Theorem 2.

(4) Because $|C| = 2^k$, this assertion restates Theorem 3. ∎


In practice, an $(n, k)$-code $C$ contains a large number of words, so implementing
nearest neighbor decoding by computing the distance between a received word and
all $2^k$ code words is impractical at best. Fortunately, methods exist for reducing
the amount of work required. One of these methods, called coset decoding, is based
on the fact that the group $B^n$ is partitioned into cosets by the subgroup $C$. In fact,
there are $2^n / 2^k = 2^{n-k}$ cosets $w + C$, where $w \in B^n$. The method depends on the
following notion.

In each coset of $C$ in $B^n$, choose a word $e$ of minimum weight, called the coset
leader for that coset. Note that there may be more than one candidate for coset
leader. For example, if $C$ is the code in Example 3 and $w = 0111000$, the coset


$$w + C = \{0111000, 0010010, 1101101, 1001000, 1100010, 0011101, 0110111, 1000111\}$$


has two members of minimum weight 2.
After choosing the coset leaders, we can easily state the decoding procedure. 


Coset Decoding Let $C$ be an $(n, k)$-code. If a word $w \in B^n$ is received, and if $e$
is any coset leader for $w + C$, decode $w$ as $w - e$.


Theorem 5. Coset decoding is nearest neighbor decoding.


Proof. Let $C$ be an $(n, k)$-code. If a word $w$ is received and $e$ is any coset leader in
$w + C$, then $c = w - e$ is a code word in $C$ (because $e$ is in $w + C$). We must show
that $w$ is as close to $c$ as to any other element $d$ of $C$. We have $w - d \in w + C = e + C$,
so $\operatorname{wt} e \leq \operatorname{wt}(w - d)$ by the choice of $e$. Hence

$$d(w, c) = \operatorname{wt}(w - c) = \operatorname{wt} e \leq \operatorname{wt}(w - d) = d(w, d),$$


which is what we wanted. ∎


Example 7. Consider the (6, 3)-code:
$$C = \{000000, 001110, 010101, 011011, 100011, 101101, 110110, 111000\}.$$
If $w = 101011$ and $v = 011100$ are received, decode them using coset decoding.
Solution. The cosets generated by $w$ and $v$ are
$$w + C = \{101011, 100101, 111110, 110000, 001000, 000110, 011101, 010011\}$$
$$v + C = \{011100, 010010, 001001, 000111, 111111, 110001, 101010, 100100\}$$


The only coset leader in $w + C$ is $e = 001000$, so $w$ decodes as $w - e = 100011$.
However, $v + C$ has three potential coset leaders: $f = 010010$, $g = 001001$, and
$h = 100100$. These leaders decode $v$ as 001110, 010101, and 111000, respectively.
Note that $C$ has minimum distance 3, so it will correct one error by Theorem 4.
Since $w$ is one error away from 100011 (in $C$), the code corrects $w$. But $d(v, c) \geq 2$
for every word $c$ in $C$, so the code does not correct $v$. Note that 001110, 010101,
and 111000 are all the elements of $C$ at distance 2 from $v$. □
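The computations in Example 7 can be reproduced with a short program. This is a sketch only, assuming words are stored as strings; the functions `coset`, `coset_leaders`, and `coset_decode` are illustrative names, and `coset_decode` returns every possible decoding (one per coset leader) rather than choosing one.

```python
def add_words(v: str, w: str) -> str:
    """Add two words bit by bit, modulo 2 (also serves as subtraction)."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(v, w))


def coset(w: str, code: list[str]) -> list[str]:
    """The coset w + C, listed in the same order as C."""
    return [add_words(w, c) for c in code]


def coset_leaders(w: str, code: list[str]) -> list[str]:
    """All words of minimum weight in the coset w + C."""
    members = coset(w, code)
    least = min(x.count("1") for x in members)
    return [x for x in members if x.count("1") == least]


def coset_decode(w: str, code: list[str]) -> list[str]:
    """Every code word w - e obtained from some coset leader e of w + C."""
    return [add_words(w, e) for e in coset_leaders(w, code)]


C = ["000000", "001110", "010101", "011011",
     "100011", "101101", "110110", "111000"]

print(coset_leaders("101011", C), coset_decode("101011", C))
print(coset_leaders("011100", C), coset_decode("011100", C))
```

A single coset leader means the decoding is unambiguous; several leaders, as for $v = 011100$, signal that the received word is equally close to more than one code word.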


Given an $(n, k)$-code $C$ for which $|C| = 2^k$ is not too large, we can carry out
coset decoding by constructing a table (called a standard array for $C$), the rows
of which are the various cosets $w + C$ of $C$ in $B^n$. The coset $C = 0 + C$ is listed in
the top row with 0 in column 1. (Note that 0 is the coset leader for $C$.) In general, if
$e$ is any coset leader for $w + C$, then $w + C = e + C$, and we place the elements of
this coset in a row of the table with $e$ in column 1 and $e + c$ in the column headed
by $c$ for each $c \in C$. We then decode as follows: If we receive a word $w$, we locate
it in the table (so $w = e + c$, where $e$ is a coset leader) and decode it as the code
word $c$ at the head of its column. Here is an example.


Example 8. Construct the standard array for the (4,2)-code C = {0000, 0110, 
1011, 1101}. 


Solution.

$C = 0 + C$   | 0000  0110  1011  1101
$e_1 + C$     | 0100  0010  1111  1001
$e_2 + C$     | 1000  1110  0011  0101
$e_3 + C$     | 0001  0111  1010  1100

We obtain the rows of this table as follows: The first row lists the elements
of $C$ in any order except that the coset leader 0 is in column 1; to obtain the next
row, choose any element of $B^4$ not in $C$, say 1111, and construct the coset


$$1111 + C = \{1111, 1001, 0100, 0010\}.$$


Next, we choose a coset leader, say $e_1 = 0100$ (0010 would do as well), and obtain
row 2 of the table by adding $e_1$ to the elements of row 1 in order. Thus, for example,
the word 1111 in column 3 is the sum of $e_1$ and the word 1011 (in $C$) at the head
of column 3.

We complete the rest of the table in the same way. To form any row, we choose
an element of $B^4$ not yet listed, find a coset leader in its coset, and list the coset as
a row. The remaining coset leaders are $e_2 = 1000$ and $e_3 = 0001$ (each the unique
word of minimum weight in its coset).

With the table complete, decoding is easy. For example, if we receive $w = 1010$,
we decode $w$ as $c = 1011$ because $w$ is in column 3 of the table. □
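The procedure of Example 8 can be sketched as a program. The functions `standard_array` and `decode` below are hypothetical helpers, and the code assumes that the zero word is listed first in the code. Where a coset has several words of minimum weight, this sketch simply takes the first one it meets, so its rows may appear in a different order, and with a different leader, than in the table above (here it picks 0010 rather than 0100).

```python
from itertools import product


def add_words(v: str, w: str) -> str:
    """Add two words bit by bit, modulo 2."""
    return "".join(str(int(a) ^ int(b)) for a, b in zip(v, w))


def standard_array(code: list[str], n: int) -> list[list[str]]:
    """Rows are the cosets of the code; row 1 is the code, column 1 holds leaders."""
    rows = [list(code)]          # assumes code[0] is the zero word
    listed = set(code)
    for bits in product("01", repeat=n):
        w = "".join(bits)
        if w in listed:
            continue
        members = [add_words(w, c) for c in code]
        leader = min(members, key=lambda x: x.count("1"))  # first of minimum weight
        row = [add_words(leader, c) for c in code]
        rows.append(row)
        listed.update(row)
    return rows


def decode(word: str, rows: list[list[str]]) -> str:
    """Locate the word in the array and return the code word heading its column."""
    for row in rows:
        if word in row:
            return rows[0][row.index(word)]
    raise ValueError("word not found in the array")


C = ["0000", "0110", "1011", "1101"]
array = standard_array(C, 4)
for row in array:
    print(" ".join(row))
print(decode("1010", array))
```

Received words whose coset has a unique leader, such as 1010, decode the same way whatever choices are made; only words in a coset with several candidate leaders depend on the tie-breaking rule.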


This method is impractical for large linear codes. For example, a (40, 10)-code
has $2^{30} > 10^9$ cosets, so finding the coset leaders is practically impossible. Hence
large codes are constructed using more systematic methods.


Matrix Methods 


One convenient way to obtain codes is by using matrix multiplication. Here we
take the original messages to be the elements of $B^k$. We regard them as $1 \times k$ row
matrices with entries from $\mathbb{Z}_2$ and encode by multiplying by a fixed binary matrix
(entries from $\mathbb{Z}_2$). We use the usual rules for matrix multiplication, except that we
do arithmetic modulo 2.


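Example 9. The Hamming (7, 4)-code.⁴² We use the binary matrix


$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix}.$$


The message words are the elements of $B^4$; for example, $u = 1011$ is encoded as
$uG = 1011001$ because of the matrix product


$$uG = \begin{bmatrix} 1 & 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 & 0 & 1 \end{bmatrix}.$$


In the chart below, the code words corresponding to all entries of $B^4$ appear on the
right. Here each nonzero code word has weight at least 3, so the code can detect
two errors and correct one error by Theorem 4.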


42 This code was the first nontrivial example of an error correcting code given in the groundbreaking
paper in which information theory was originated (Shannon, C.E., A mathematical theory of
communication, Bell Systems Technical Journal 27 (1948), 623-656).


Message Word    Code Word

0000            0000000
0001            0001011
0010            0010101
0011            0011110
0100            0100110
0101            0101101
0110            0110011
0111            0111000
1000            1000111
1001            1001100
1010            1010010
1011            1011001
1100            1100001
1101            1101010
1110            1110100
1111            1111111    □
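The chart can be regenerated mechanically from the matrix $G$ of Example 9. The following sketch stores $G$ as a list of rows and multiplies modulo 2; the function name `encode` and the dictionary `codewords` are illustrative choices, not notation from the text.

```python
from itertools import product

# Generator matrix of the Hamming (7, 4)-code from Example 9.
G = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]


def encode(u, G):
    """Compute the row vector uG with arithmetic modulo 2."""
    k, n = len(G), len(G[0])
    return [sum(u[i] * G[i][j] for i in range(k)) % 2 for j in range(n)]


# Map each message word in B^4 to its code word, both written as strings.
codewords = {
    "".join(map(str, u)): "".join(map(str, encode(u, G)))
    for u in product((0, 1), repeat=4)
}

# Minimum weight of a nonzero code word, which is the minimum distance (Theorem 4).
min_weight = min(c.count("1") for c in codewords.values() if "1" in c)

print(codewords["1011"], min_weight)
```

Only the four rows of $G$ need to be stored; all sixteen code words are recovered from them on demand.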


Observe that the first four columns of the matrix $G$ in Example 9 form the $4 \times 4$
identity matrix $I_4$. This ensures that the first four digits of each code word $uG$ form
the original message word. The general situation is described using the following
terminology.

An $(n, k)$-code $C$ is called a systematic code if each message word in $B^k$ forms
the first $k$ digits of exactly one code word. A $k \times n$ matrix of the form⁴³


$$G = [I_k \ \ A]$$

is a standard generator matrix if $I_k$ is the $k \times k$ identity matrix and $A$ is a
$k \times (n - k)$ binary matrix. Thus, the matrix $G$ in Example 9 is a $4 \times 7$ standard
generator matrix $G = [I_4 \ \ A]$ where


$$A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}.$$


The code itself is given as $C = \{uG \mid u \in B^4\}$.


Theorem 6. Let $G$ be a $k \times n$ standard generator matrix. Then


$$C = \{uG \mid u \in B^k\}$$


is a systematic $(n, k)$-code. Conversely, every systematic $(n, k)$-code is given in this
way by a standard generator matrix $G$.


43 If $A$ and $B$ are $k \times m$ and $k \times n$ matrices, the notation $[A \ \ B]$ indicates the $k \times (m + n)$ matrix
with $A$ occupying the first $m$ columns and $B$ occupying the last $n$ columns. The matrix $[A \ \ B]$ is
said to be given in block form.


Proof. Define $\sigma : B^k \to B^n$ by $\sigma(u) = uG$ for all $u \in B^k$. Then $\sigma$ is a group
homomorphism because matrix multiplication satisfies $(u + v)G = uG + vG$. As $\sigma$ is
clearly onto $C$, this shows that $C$ is a subgroup of $B^n$. In fact, $\sigma$ is one-to-one. To
see this, write $G = [I_k \ \ A]$, where $A$ is $k \times (n - k)$. Then

$$\sigma(u) = u[I_k \ \ A] = [uI_k \ \ uA] = [u \ \ uA] \quad \text{for all } u \in B^k.$$


Hence $\sigma$ is one-to-one because $\sigma(u) = \sigma(v)$ implies that $[u \ \ uA] = [v \ \ vA]$, whence
$u = v$. Thus $B^k$ and $C$ are isomorphic groups and, in particular, $|C| = |B^k| = 2^k$.
Thus $C$ is an $(n, k)$-code; it is systematic because $\sigma(u) = [u \ \ uA]$ for all $u \in B^k$.
This proves the first part of Theorem 6; we leave the converse as Exercise 26. ∎


Example 10. The (6, 3)-code
$$C = \{000000, 001110, 010101, 011011, 100011, 101101, 110110, 111000\}$$
in Example 7 is systematic, and the reader can verify that it is generated by the
standard generator matrix

$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 & 0 \end{bmatrix}.$$

That is, $C = \{uG \mid u \in B^3\}$. □


If $C$ is a systematic $(n, k)$-code, we can easily write down a standard generator
matrix for $C$. Because $C$ is systematic, it contains a word $c_i$ whose first $k$ digits
form row $i$ of $I_k$. Let $G$ be the $k \times n$ matrix whose rows are $c_1, c_2, \ldots, c_k$ in order:


$$G = \begin{bmatrix} c_1 \\ \vdots \\ c_k \end{bmatrix}.$$


Then $G$ is a standard generator matrix and $C = \{uG \mid u \in B^k\}$ (see Exercise 26).
Incidentally, we say that $C$ is generated by $G$ when $C = \{uG \mid u \in B^k\}$. In this
case $C$ consists of 0 and all sums of one or more of the generating words $c_1, c_2, \ldots, c_k$.
This is illustrated in Example 11.


Example 11. Both the codes $\{0000, 0101, 1010, 1111\}$ and
$\{0000, 0010, 0101, 0111, 1000, 1010, 1101, 1111\}$
in Example 6 are systematic with matrices

$$\begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$


On the other hand, the (7, 3)-code in Example 3 is not systematic. Nonetheless,
every $(n, k)$-code is close to being systematic in the sense that it contains $k$ words
with the following property: If $F$ is the $k \times n$ matrix with these words as rows, $F$
contains every column of the $k \times k$ identity matrix $I_k$ (Exercise 27).


The use of a standard generator matrix is a convenient method of generating
$(n, k)$-codes not only because retrieval is easy but also because a $k \times n$ matrix has
only $kn$ entries to store, whereas the code contains $2^k$ words of $n$ entries each. Moreover,
the process of encoding with a systematic code is simple: Multiply the message
word by the generator matrix. Hence it is not surprising that matrix methods give
a simple way to detect and correct errors.

To understand why, let $C$ be a systematic binary $(n, k)$-code with standard
generator matrix $G = [I_k \ \ A]$, where $A$ is a $k \times (n - k)$ binary matrix. The parity-check
matrix⁴⁴ for $C$ is the $n \times (n - k)$ matrix given in block form by

$$H = \begin{bmatrix} A \\ I_{n-k} \end{bmatrix}.$$

If $w$ is a word in $B^n$, the word $wH$ in $B^{n-k}$ is called the syndrome of $w$. Note
that each of $G$ and $H$ completely determines the other, so either matrix determines
the code $C$.


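Example 12. The Hamming (7, 4)-code in Example 9 has the generator matrix

$$G = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & 1 & 1 \end{bmatrix} = [I_4 \ \ A], \quad \text{where} \quad A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix}.$$

Hence the parity-check matrix is

$$H = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$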

In Example 12, the reader can verify that GH = 0 is the zero matrix. This 
relation holds in general. 


Lemma. If $G$ and $H$ are the standard generator matrix and the parity-check matrix
of a systematic $(n, k)$-code, then $GH = 0$.


Proof. Write $G = [I_k \ \ A]$ so that $H = \begin{bmatrix} A \\ I_{n-k} \end{bmatrix}$. Then block multiplication gives


$$GH = [I_k \ \ A] \begin{bmatrix} A \\ I_{n-k} \end{bmatrix} = I_k A + A I_{n-k} = A + A = 0,$$


where $A + A = 0$ because $A$ is binary and $x + x = 0$ for all $x \in \mathbb{Z}_2$. ∎


44 Systematic binary codes are often defined using the parity-check matrix. Then the transpose of
$H$ is referred to as the parity-check matrix.


Theorem 7. Orthogonality Theorem. Let $C$ be a systematic $(n, k)$-code with
parity-check matrix $H$.

(1) $C = \{w \in B^n \mid wH = 0\}$.

(2) Words $w$ and $v$ in $B^n$ lie in the same $C$-coset if and only if $wH = vH$.


Proof. (1) Let $G = [I_k \ \ A]$ be the generator matrix for $C$, so $H = \begin{bmatrix} A \\ I_{n-k} \end{bmatrix}$. Define
$\alpha : B^n \to B^{n-k}$ by $\alpha(w) = wH$ for all $w \in B^n$. Then $\alpha$ is a group homomorphism
because $(w + v)H = wH + vH$, and (1) amounts to showing that $C = \ker \alpha$. We
first verify that $\alpha$ is onto. If $v \in B^{n-k}$, let $w = [0 \ \ v] \in B^n$ be the word whose first
$k$ bits are zero and which ends with $v$. Then


$$\alpha(w) = wH = [0 \ \ v] \begin{bmatrix} A \\ I_{n-k} \end{bmatrix} = 0A + vI_{n-k} = 0 + v = v.$$


Hence $\alpha$ is onto, so $\operatorname{im} \alpha = B^{n-k}$. Now the isomorphism theorem (Theorem 4 §2.10)
gives $B^n / (\ker \alpha) \cong B^{n-k}$, so $|B^n| / |\ker \alpha| = |B^{n-k}|$. Therefore $|\ker \alpha| = 2^k$ and so
$|\ker \alpha| = |C|$. Then to prove that $C = \ker \alpha$, it suffices to show that $C \subseteq \ker \alpha$. But
if $c \in C$, then $c = uG$ for some $u \in B^k$ (Theorem 6), so $\alpha(c) = cH = uGH = u0 = 0$
by the lemma. Hence $C \subseteq \ker \alpha$.

(2) For $w$ and $v$ in $B^n$, we have a chain of equivalences


$$w + C = v + C \iff w - v \in C \iff (w - v)H = 0 \iff wH = vH$$


where the first equivalence comes from Theorem 1 §2.6, the second is by (1), and
the third is because $(w - v)H = wH - vH$. ∎


The orthogonality theorem enables us to reformulate the coset decoding 
algorithm entirely in terms of the parity-check matrix. 


Syndrome Decoding Let $C$ be a systematic $(n, k)$-code with parity-check matrix
$H$. If $w \in B^n$ is received, compute its syndrome $wH$ and find a word $e \in B^n$
of minimal weight with the same syndrome (that is, $wH = eH$). Decode $w$ as
$c = w - e$.


The advantage of this method is that it requires knowing only the syndromes of 
the coset leaders (rather than the entire standard array), and sometimes the coset 
leaders can be discovered without finding the whole array. 
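Syndrome decoding for the Hamming (7, 4)-code of Example 12 can be sketched as follows. The table `leaders`, which records one word of minimal weight for each syndrome, is built here by brute force over all of $B^7$; that is an assumption made for simplicity and is feasible only for small $n$. The function names are illustrative.

```python
from itertools import product

# The block A of G = [I_4 A] from Example 12; H = [A; I_3] is the parity-check matrix.
A = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1]]
k, r = 4, 3
n = k + r
H = A + [[1 if i == j else 0 for j in range(r)] for i in range(r)]


def syndrome(w):
    """Compute wH with arithmetic modulo 2; w is a tuple of n bits."""
    return tuple(sum(w[i] * H[i][j] for i in range(n)) % 2 for j in range(r))


# For each syndrome, remember the first (hence a minimum-weight) word producing it.
leaders = {}
for e in sorted(product((0, 1), repeat=n), key=sum):
    leaders.setdefault(syndrome(e), e)


def syndrome_decode(w):
    """Decode w as w - e, where e is the stored leader with the same syndrome as w."""
    e = leaders[syndrome(w)]
    return tuple((a + b) % 2 for a, b in zip(w, e))


sent = (1, 0, 1, 1, 0, 0, 1)       # the code word 1011001 from Example 9
received = (1, 1, 1, 1, 0, 0, 1)   # the same word with its second bit changed
print(syndrome(sent), syndrome(received), syndrome_decode(received))
```

For this code the seven rows of $H$ are distinct and nonzero, so each nonzero syndrome identifies a single-bit error directly: the syndrome of the received word equals the row of $H$ at the position of the error.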

Nearest neighbor decoding, as we have described it, is complete decoding in the
sense that every received word is decoded. However, in many cases (especially where
retransmission is easy) a better approach is to use a partial decoding procedure that
corrects $t$ errors and calls for retransmission when more than $t$ errors are detected.
We conclude by describing one such algorithm.

In this section, we have merely touched the surface of algebraic coding theory.
For example, these results generalize with very little change if an $(n, k)$-code is
defined to be a $k$-dimensional subspace of an $n$-dimensional vector space $V$ over
a finite field $F$ (in our discussion, $V = B^n$ and $F = \mathbb{Z}_2$). Even more sophisticated
coding algorithms exist that use ring theory and field theory as well as group theory
and linear algebra (see Section 6.7 for one such application).⁴⁵
Exercises 2.11 
1. Find the Hamming weight of each word. 
(a) 10110110 (b) 11010110 
(c) 00101011011 (d) 010110101011 
2. Find the Hamming distance between each pair of words. 
(a) 101101 and 010101 (b) 10110101 and 01110111 
(c) 1110111 and 0001000 (d) 10110111 and 01001011 


3. Show that $d(v, w) = d(u + v, u + w)$ for all $u$, $v$, and $w$ in $B^n$.

4. What is the maximum value of $d(v, w)$ when $v, w \in B^n$? Describe the pairs of words
$v$ and $w$ in $B^n$ with $d(v, w)$ as large as possible.

5. Let $\bar{w}$ be the word obtained from $w \in B^n$ by changing every bit.
(a) Show that $\bar{v} + \bar{w} = v + w$ for all $v, w \in B^n$.
(b) Show that $d(v, w) + d(v, \bar{w}) = n$ for all $v, w \in B^n$.

6. Let $C$ be the (7, 3)-code in Example 3. Find the nearest neighbors to each of the
following words in $B^7$ and so correct them (if possible).
(a) 0110101 (b) 0101110 (c) 1011001 (d) 1100110

7. How many errors can be detected or corrected by each of the following codes?
(a) $C = \{0000000, 0011110, 0100111, 0111001, 1001011, 1010101, 1101100, 1110010\}$
(b) $C = \{0000000000, 0010011111, 0101100111, 0111111000, 1001110001, 1011101110, 1100010110, 1110001001\}$

8. Let $c$ be a word in $B^n$ and let $0 \leq t \leq n$. Show that $S_t(c) = \{v + c \mid v \in S_t(0)\}$.

9. (a) Show that the Hamming bound is equality if $t = 1$ in the (7, 4)-Hamming code.
(b) What is the maximum number of errors that an (8, 3)-code can correct?
(c) Is there a (7, 2)-code of minimum distance 5?

10. (a) If a systematic $(n, 2)$-code corrects one error, use the Hamming bound to show
that $n \geq 5$ and find a (5, 2)-code that corrects one error.

(b) If a systematic $(n, 2)$-code corrects two errors, use the Hamming bound to show
that $n \geq 7$. Show that no (7, 2)-code can correct two errors. Is there an (8, 2)-code
that corrects two errors? Justify your answer.

11. (a) If an $(n, 3)$-code corrects two errors, show that $n \geq 9$.

(b) Find a (10, 3)-code that corrects two errors. It can be shown that there is no
(9, 3)-code that corrects two errors.

12. Given $r \geq 2$, write $n = 2^r - 1$ and $k = 2^r - r - 1$ so that $n - k = r$. Define $H$ to be
the $n \times r$ parity-check matrix consisting of all $n = 2^r - 1$ nonzero elements of $B^r$
with $I_r$ forming the last $r$ rows. The corresponding $(n, k)$-code is called a Hamming
code. (The (7, 4)-Hamming code is the case $r = 3$.) Show that every Hamming code
corrects one error.


45 An introduction to the subject is given in Pless, V., Introduction to the Theory of Error Correcting
Codes, New York: Wiley, 1982. A more thorough treatment (with an extensive bibliography)
is that by MacWilliams, F.J., and Sloane, N.J.A., The Theory of Error Correcting Codes, Vols. I
and II, New York: North Holland, 1977. Finally, a useful survey is contained in Chapter 4 of Lidl,
R., and Pilz, G., Applied Abstract Algebra, New York: Springer-Verlag, 1984.


13. If a code word $c$ is transmitted and $w$ is received, show that coset decoding will
correctly decode $w$ if and only if $w - c$ is a coset leader in $w + C$.

14. Suppose that an $(n, k)$-code $C$ has the property that each word $e \in B^n$, with $\operatorname{wt} e \leq t$,
is a coset leader in $e + C$. Show that $C$ corrects $t$ errors by using coset decoding.

15. (a) Show that no (4, 2)-code can correct single errors.
(b) Construct a (5, 2)-code that can correct a single error.

16. (a) Show that no (6, 3)-code can correct two errors.
(b) Construct a (6, 3)-code that can correct a single error.
(c) Show that no (7, 3)-code can correct two errors.

17. Given words $v$ and $w$ in $B^n$, define their product $vw$ to be the word whose $i$th digit
is the product $v_i w_i$ in $\mathbb{Z}_2$, where $v_i$ and $w_i$ are the $i$th digits of $v$ and $w$.
(a) Show that $\operatorname{wt}(v + w) + 2\operatorname{wt}(vw) = \operatorname{wt} v + \operatorname{wt} w$.
(b) Deduce the triangle inequality: $\operatorname{wt}(v + w) \leq \operatorname{wt} v + \operatorname{wt} w$. (See Theorem 1.)
(c) Show that equality holds in (b) if and only if the $i$th bit of $w$ is 0 whenever the
$i$th bit of $v$ is 1.

18. If $v, w \in B^n$, show that $\operatorname{wt}(v + w) \geq \operatorname{wt} v - \operatorname{wt} w$ with equality if and only if the $i$th
bit of $v$ is 1 whenever the $i$th bit of $w$ is 1. [Hint: Preceding exercise.]

19. If $C$ is an $(n, k)$-code, $w \in B^n$ and $w \notin C$, show that $D = C \cup (w + C)$ is an $(n, k + 1)$-code.

20. Write down the standard generator matrix $G$ and the parity-check matrix $H$ for each
of the following systematic codes.
(a) $C = \{00000, 11111\}$.
(b) $C$ = any systematic $(n, 1)$-code.
(c) The code in Exercise 7(a).
(d) The code in Exercise 7(b).

21. List the codes generated by each standard generator matrix.
(a) [matrix garbled in the source; its first row reads 1 0 1 1]
(b) [matrix garbled in the source; its first row reads 10101041]
(c) $\begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \end{bmatrix}$
(d) $\begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 0 & 0 \end{bmatrix}$

22. If $C$ is the $(n, n - 1)$-parity-check code (Example 2), show that $C$ is systematic and
describe the standard generating matrix $G$ and the parity-check matrix $H$.

23. (Requires matrix algebra) Prove Theorem 7(1) without using the isomorphism
theorem by writing each $w \in B^n$ such that $wH = 0$ as $w = [u \ \ v]$ where $u$ consists of
the first $k$ bits of $w$, and $v$ is the last $n - k$ bits of $w$.

24. (Requires matrix algebra) Let $C$ and $C'$ be $(n, k)$-codes, with standard generator
matrices $G$ and $G'$, and parity-check matrices $H$ and $H'$, respectively.
(a) Show that $C = C'$ if and only if $G = G'$.
(b) Show that $C = C'$ if and only if $H = H'$.

25. Let $C$ be an $(n, k)$-code.
(a) Show that either each word in $C$ has even weight or exactly half have even weight.
(b) Show that either each word in $C$ has $n$th bit 0 or exactly half have $n$th bit 0.
(c) Generalize.

26. Show that every systematic $(n, k)$-code $C$ is generated by a $k \times n$ standard generator
matrix $G$; that is, $C = \{uG \mid u \in B^k\}$. [Hint: Let $c_1, c_2, \ldots, c_k$ be the rows of $I_k$; that
is, $c_i$ has the $i$th bit 1 and all other bits 0. If $[c_i \ \ c_i']$ is the unique element of $C$ with
$c_i$ as its first $k$ bits, take $G = [I_k \ \ A]$, where the rows of $A$ are $c_1', c_2', \ldots, c_k'$.]


27. (Requires linear algebra) Show that every $(n, k)$-code $C$ contains $k$ words $c_1, c_2, \ldots, c_k$
such that the $k \times n$ matrix $K = \begin{bmatrix} c_1 \\ \vdots \\ c_k \end{bmatrix}$ contains every column of the $k \times k$ identity
matrix $I_k$. [Hint: Regard $C$ as a vector space over $\mathbb{Z}_2$ and let $\{b_1, \ldots, b_k\}$ be a
basis. If $B = \begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix}$, carry $B$ to reduced row-echelon form $B \to R$ and let $c_i$ be
row $i$ of $R$.]


Chapter 3 


Rings 


Algebra is the intellectual instrument which has been created for rendering clear the 
quantitative aspect of the world. 


—Alfred North Whitehead 


Mathematics takes us still further from what is human into the region of absolute 
necessity, to which not only the actual world, but every possible world must conform. 


—Bertrand Russell 


Two of the earliest sources of the theory of rings lie in geometry and number theory. 
The study of surfaces determined by polynomial equations involved the addition and 
multiplication of polynomials in several variables. In addition, attempts to extend 
the prime factorization theorem for integers led to consideration of sets of complex 
numbers that were closed under addition and multiplication. Both cases involve a 
commutative multiplication. David Hilbert, who coined the term ring, and Richard 
Dedekind began the abstraction of these systems. 

Earlier, in 1843, William Rowan Hamilton had introduced his quaternions. They 
are a noncommutative ring that contains the complex numbers, and he developed 
a calculus for them that he hoped would be useful in physics. At about the same 
time, Hermann Günther Grassmann was studying rings obtained by introducing
a multiplication in what would today be called a finite dimensional vector space. 
The study of these “hypercomplex numbers” culminated in 1909 in the structure 
theorems of Joseph Henry MacLagan Wedderburn, which mark the beginning of 
noncommutative ring theory. 

However, it was not until 1921 that Emmy Noether unified and simplified much 
of the work up to her time by applying “finiteness conditions” to rings. Her monumental
work has, as B. L. van der Waerden observed, “had a profound effect on the


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 




development of modern algebra.” In particular, in 1927 it motivated Emil Artin to 
prove a far-reaching extension of Wedderburn’s theorem that influenced a whole
generation of ring theorists. This result is presented in Chapter 11. 


3.1 EXAMPLES AND BASIC PROPERTIES 


The most commonly used algebraic systems are the sets Z, R, Q, and C of numbers, 
and they have two operations: they are closed under addition and multiplication. 
In this chapter we discuss such systems for which addition and multiplication satisfy 
many of the properties familiar from arithmetic. 

A set R is called a ring if it has two binary operations, written as addition and
multiplication, satisfying the following axioms for all a, b, and c in R.


R1  a + b = b + a.

R2  a + (b + c) = (a + b) + c.

R3  An element 0 in R exists such that 0 + a = a for all a.

R4  For each a in R an element −a in R exists such that a + (−a) = 0.

R5  a(bc) = (ab)c.

R6  An element 1 in R exists such that 1·a = a = a·1 for all a.

R7  a(b + c) = ab + ac and (b + c)a = ba + ca.


And R is called a commutative ring if, in addition,

R8  ab = ba for all a and b in R.
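
For a small finite example the axioms can be checked mechanically. The following Python sketch is an illustration only and is not part of the text: it assumes the set is {0, 1, ..., n − 1} with addition and multiplication taken modulo n, and the function name `is_commutative_ring_mod` is a hypothetical helper chosen for this example.

```python
from itertools import product

def is_commutative_ring_mod(n):
    """Brute-force check of axioms R1-R8 for the integers modulo n (illustrative only)."""
    R = range(n)
    add = lambda a, b: (a + b) % n
    mul = lambda a, b: (a * b) % n
    one = 1 % n  # candidate unity; equals 0 when n == 1 (the zero ring)
    for a, b, c in product(R, repeat=3):
        if add(a, b) != add(b, a):                          # R1
            return False
        if add(a, add(b, c)) != add(add(a, b), c):          # R2
            return False
        if mul(a, mul(b, c)) != mul(mul(a, b), c):          # R5
            return False
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):  # R7, left law
            return False
        if mul(add(b, c), a) != add(mul(b, a), mul(c, a)):  # R7, right law
            return False
        if mul(a, b) != mul(b, a):                          # R8
            return False
    for a in R:
        if add(0, a) != a:                                  # R3
            return False
        if not any(add(a, x) == 0 for x in R):              # R4
            return False
        if mul(one, a) != a or mul(a, one) != a:            # R6
            return False
    return True
```

A call such as `is_commutative_ring_mod(6)` runs through every triple of elements, so the check is only practical for small n.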


The first four axioms assert that a ring R is an additive abelian group. The 
additive unity 0 in axiom R3 is called the zero of R, and the additive inverse —a of 
a in axiom R4 is called the negative of the element a. Axioms R5 and R6 show that 
R is a multiplicative monoid, so the element 1, called the unity⁴⁶ of R, is unique
(Theorem 1 §2.1). Sometimes we write the zero and unity as 0_R and 1_R if the ring
must be emphasized. The two identities in axiom R7 are called the distributive 
laws, and are the only axioms that connect addition and multiplication. 

Several important examples satisfy all the axioms for a ring except possibly R6, 
the existence of a unity. We call them general rings⁴⁷. However, nearly all the
examples mentioned in this book have a unity. 


Example 1. Each of ℤ, ℝ, ℚ, and ℂ is a commutative ring; ℤ_n is a commutative
ring for each n ≥ 2 by Theorem 4 §1.3.


Example 2. The set M_2(ℝ) of all 2 × 2 matrices over ℝ is a ring using matrix
addition and multiplication (see Appendix B). Note that M_2(ℝ) is noncommutative.
Indeed, if $A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ then AB = B but BA = 0.


Example 3. The set ℝ[x] of all polynomials with coefficients in ℝ is a ring with
the usual addition and multiplication. We discuss such rings in detail in Chapter 4.


46 Other commonly used terms are unit element and identity.

47 Many authors use the term ring even if there is no unity and employ the term ring with unity
when a unity exists. Our terminology is gaining acceptance because many examples of interest do 
indeed have a unity. 




Example 4. If X is a nonempty set, let F(X, ℝ) be the set of all real valued
functions f : X → ℝ. Then F(X, ℝ) is a commutative ring using pointwise addition
and multiplication: If f and g are in F(X, ℝ), we define


f + g : X → ℝ  by  (f + g)(x) = f(x) + g(x),  for all x ∈ X,
f · g : X → ℝ  by  (f · g)(x) = f(x)g(x),  for all x ∈ X.


The zero of F(X, ℝ) is the constant function θ : X → ℝ given by θ(x) = 0 for all
x ∈ X; the negative of f ∈ F(X, ℝ) is −f : X → ℝ, defined by (−f)(x) = −f(x)
for all x ∈ X; and the unity is the constant function 1 : X → ℝ defined by 1(x) = 1
for all x ∈ X. We leave the routine verification of the other axioms to the reader. □


Example 5. If R_1, R_2, ..., R_n are rings, we define componentwise operations on
the cartesian product R_1 × R_2 × ··· × R_n as follows:


(r_1, r_2, ..., r_n) + (s_1, s_2, ..., s_n) = (r_1 + s_1, r_2 + s_2, ..., r_n + s_n)
(r_1, r_2, ..., r_n) · (s_1, s_2, ..., s_n) = (r_1 s_1, r_2 s_2, ..., r_n s_n)


Then R_1 × R_2 × ··· × R_n is a ring, the direct product of the rings R_1, R_2, ..., R_n,
and it is commutative if and only if each R_i is commutative. The additive group
is just the direct product of the (additive) groups R_i, the unity is (1, 1, ..., 1), and
the zero is (0, 0, ..., 0). □

In the ring ℝ the property 0·r = 0 for all r is important and highlights the
unique multiplicative role played by 0. In fact, this property holds for every ring R. 
Because it involves the multiplication of R, and because 0 is the additive identity 
for R, it is not surprising that it is a consequence of the distributive laws. 


Theorem 1. If 0 is the zero of a ring R, then 0r = 0 = r0 for every r ∈ R.


Proof. Given r ∈ R, compute: 0r + 0r = (0 + 0)r = 0r = 0r + 0. Hence, 0r = 0
follows by adding −0r to both sides (in the additive group R). Similarly, r0 = 0. ∎


If it happens that 1 = 0 in a ring R then, for any r ∈ R, r = r·1 = r·0 = 0 by
Theorem 1. Hence, R = {0} is the zero ring in Example 6.


Example 6. The set R = {0} is a ring where 0 + 0 = 0 and 0·0 = 0. It is called
the zero ring and denoted R = 0. 


Theorem 1 allows us to define matrix rings over an arbitrary ring. 


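Example 7. If n ≥ 1, an n × n matrix over a ring R is an n × n array

$$A = [a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

where each a_ij is an element of R called the (i, j)-entry of A.


The set of all n × n matrices over R is denoted M_n(R).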




As for numerical matrices, we define equality, addition, and multiplication in M_n(R)
as follows: If A = [a_ij] and B = [b_ij] are in M_n(R), then


(1) A = B if and only if a_ij = b_ij for all i and j.
(2) A + B = [a_ij + b_ij].
(3) AB = [c_ij] where $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$ for all i and j.


These are called matrix operations. It is routine to verify that M_n(R) is an
additive abelian group, the zero being the zero matrix 0 = [0] with each entry zero,
and the negative of A = [a_ij] is −A = [−a_ij]. The associative and distributive laws
(axioms R5 and R7) follow from the corresponding properties of R (see Appendix
B). The unity of M_n(R) is the n × n identity matrix

$$I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$

with ones on the main diagonal (upper left to lower right) and zeros elsewhere.
Hence M_n(R) is a ring, called the n × n matrix ring over R. Note that, if n ≥ 2,
then M_n(R) is noncommutative for every ring R ≠ 0 (see Example 2). Thus, for
example, M_2(ℤ_2) is a noncommutative ring with 16 elements. □
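
The claim about M_2(ℤ_2) is small enough to confirm by listing every matrix. The sketch below is an illustration under stated assumptions, not the text's own method: matrices are stored as tuples of rows, entries are reduced modulo n, and the names `mat_mul` and `all_matrices` are hypothetical helpers.

```python
from itertools import product

def mat_mul(A, B, n):
    """Multiply 2x2 matrices (tuples of rows) with entries reduced modulo n."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % n for j in range(2))
        for i in range(2)
    )

def all_matrices(n):
    """Every 2x2 matrix with entries in {0, ..., n-1}."""
    return [((a, b), (c, d)) for a, b, c, d in product(range(n), repeat=4)]

M = all_matrices(2)
# Ordered pairs (A, B) with AB != BA; a nonempty list means the ring is noncommutative.
noncommuting = [(A, B) for A in M for B in M if mat_mul(A, B, 2) != mat_mul(B, A, 2)]
print(len(M))
print(len(noncommuting) > 0)
```

The same helpers work for any modulus n, although the number of matrices grows as n⁴.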


Because a ring R is an additive abelian group, the laws of exponents take on
a different form: If n ∈ ℤ and a ∈ R, we write the nth “power” a^n of a as the nth
“multiple” na of a in an additive group. The following expressions translate other
facts about exponents to additive notation.


Additive notation        Multiplicative notation
0a = 0                   a^0 = 1
1a = a                   a^1 = a
(−1)a = −a               a^(−1) = a^(−1)
(n + m)a = na + ma       a^(n+m) = a^n a^m
n(a + b) = na + nb       (ab)^n = a^n b^n  (if ab = ba)
n(ma) = (nm)a            (a^m)^n = a^(mn)

We use these formulas frequently without further comment. 
If 1_R denotes the unity of R, we also write 1_R + 1_R = 2, 1_R + 1_R + 1_R = 3, and
so on. More generally, we write
k·1_R = k, for all integers k


when no confusion can result. This notation is consistent with our convention of
writing ℤ_n = {0, 1, 2, ..., n − 1}. Of course we are interested in how this “multiplication”
by integers relates to the multiplication in R. As in Theorem 1, this depends
on the distributive laws.


Theorem 2. Let r and s be arbitrary elements of a ring R.
(1) (−r)s = r(−s) = −(rs).
(2) (−r)(−s) = rs.
(3) (mr)(ns) = mn(rs) for all integers m and n.




Proof. Theorem 1 gives (−r)s + rs = (−r + r)s = 0s = 0 = −(rs) + rs. Hence,
(−r)s = −(rs) by cancellation. Similarly, r(−s) = −(rs), proving (1). Now (1)
gives (−r)(−s) = r[−(−s)] = rs, proving (2). Turning to (3), we begin with

r(ns) = n(rs), for all n ∈ ℤ. (*)

This holds for n = 0 by Theorem 1. If it holds for some n ≥ 0, then

r[(n + 1)s] = r(ns + s) = r(ns) + rs = n(rs) + 1(rs) = (n + 1)(rs).

Hence, (*) holds for all n ≥ 0 by induction. If n < 0, write n = −m, m > 0. Then

r(ns) = r[−(ms)] = −[r(ms)] = −[m(rs)] = n(rs),

which proves (*). We leave it to the reader to show that (mr)s = m(rs) holds for
all m ∈ ℤ, and that this equation and (*) imply (3). ∎


If r and s are elements of a ring R, their difference r − s is defined by
r − s = r + (−s).


Thus, the equation x + s = r in R has the unique solution x = r − s. As for numbers,
we say that r − s is the result of subtracting s from r. Theorem 2 then gives the
following extensions of the distributive laws:


a(b − c) = ab − ac and (b − c)a = ba − ca.


These expressions allow us to use the familiar properties of subtraction in any ring. 
If R is any ring, we define the characteristic of R, denoted char R, in terms of 
the order of 1_R in the additive group (R, +):


char R = n  if o(1_R) = n in the additive group (R, +),
char R = 0  if o(1_R) = ∞ in the additive group (R, +).


If k is an integer, we write kR = 0 to mean that kr = 0 for each r ∈ R. By Theorem
2, this happens if and only if k1_R = 0 (verify), so we obtain


Theorem 3. If R is a ring and char R = n then 
(1) If char R=n > 0, then kR = 0 if and only if n divides k. 
(2) If char R = 0, then kR = 0 if and only if k = 0. 


Example 8. Each of ℤ, ℝ, ℚ, and ℂ has characteristic 0. Given n ≥ 2, the ring
ℤ_n has characteristic n.
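
The definition of the characteristic as the additive order of the unity translates directly into a loop. The sketch below is illustrative only: the names `additive_order_of_one`, `char_Zn`, and `char_product` are hypothetical, and the search bound `limit` is an assumption of this example (a return value of 0 means only that no finite order was found within that bound, not a proof of characteristic 0).

```python
def additive_order_of_one(add, zero, one, limit=10_000):
    """Least k >= 1 with k*1 = 0, or 0 if no such k is found up to `limit`."""
    total = one  # holds k*1 at the start of iteration k
    for k in range(1, limit + 1):
        if total == zero:
            return k
        total = add(total, one)
    return 0

def char_Zn(n):
    """Characteristic of the integers modulo n."""
    return additive_order_of_one(lambda a, b: (a + b) % n, 0, 1 % n)

def char_product(n, m):
    """Characteristic of the direct product of Z_n and Z_m (componentwise addition)."""
    add = lambda x, y: ((x[0] + y[0]) % n, (x[1] + y[1]) % m)
    return additive_order_of_one(add, (0, 0), (1 % n, 1 % m))
```

For instance, `char_product(4, 6)` follows the multiples of the unity (1, 1) until both coordinates vanish together.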


The binomial theorem for real variables (Example 6 §1.1) has a wide ranging 
generalization that will be needed later. 


Theorem 4. Binomial Theorem. Let a and b be elements in a ring R which
commute, that is ab = ba. Then, for each n ≥ 0

$$(a+b)^n = \binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \binom{n}{2}a^{n-2}b^2 + \cdots + \binom{n}{n-1}ab^{n-1} + \binom{n}{n}b^n,$$

where $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ denotes the binomial coefficient (see Section 1.1).




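Proof. It holds if n = 0 because r^0 = 1 for all r ∈ R, and it holds for n = 1 because
$\binom{n}{0} = 1 = \binom{n}{n}$ for each n ≥ 0. If it holds for some n ≥ 1, compute

$$\begin{aligned}
(a+b)^{n+1} &= (a+b)(a+b)^n \\
&= (a+b)\left[\binom{n}{0}a^n + \binom{n}{1}a^{n-1}b + \cdots + \binom{n}{n}b^n\right] \\
&= a^{n+1} + \left[\binom{n}{0} + \binom{n}{1}\right]a^n b + \cdots + \left[\binom{n}{n-1} + \binom{n}{n}\right]ab^n + b^{n+1} \\
&= \binom{n+1}{0}a^{n+1} + \binom{n+1}{1}a^n b + \cdots + \binom{n+1}{n}ab^n + \binom{n+1}{n+1}b^{n+1}
\end{aligned}$$

using the Pascal identity $\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}$ for 1 ≤ k ≤ n (Exercise 13 §1.1). This
completes the induction, and so proves the binomial theorem. ∎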


Thus, for example, taking a = b = 1 and writing 1 + 1 = 2, we obtain

$$2^n = \binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n-1} + \binom{n}{n} \quad \text{in any ring.}$$
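
As a numerical sanity check of the binomial theorem in a commutative ring, the sketch below compares both sides in ℤ_m for all a, b and small n. It is an illustration only; the function names are hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb

def binomial_expansion_mod(a, b, n, m):
    """Right-hand side of the binomial theorem, computed in Z_m."""
    return sum(comb(n, k) * pow(a, n - k, m) * pow(b, k, m) for k in range(n + 1)) % m

def check_binomial_mod(m, n_max=6):
    """True if (a+b)^n matches the expansion in Z_m for all a, b and 0 <= n <= n_max."""
    return all(
        pow(a + b, n, m) == binomial_expansion_mod(a, b, n, m)
        for a in range(m) for b in range(m) for n in range(n_max + 1)
    )
```

Since ℤ_m is commutative, the hypothesis ab = ba holds automatically here; the check says nothing about noncommuting elements.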


Subrings 


If R is a ring, a subset S is called a subring of R if it is itself a ring with the same
operations (including the same unity)⁴⁸ as R. Thus, a subring of R is an additive
subgroup of R that contains the unity of R and is closed under multiplication. The 
subgroup test (Theorem 1 §2.3) then gives 


Theorem 5. Subring Test. A subset S of a ring R is a subring if and only if the 
following conditions are satisfied. 


(1) 0 ∈ S and 1 ∈ S.
(2) If s ∈ S and t ∈ S, then s + t, st, and −s are all in S.


As S is nonempty by (1), note that (2) is equivalent to the following condition: If
s ∈ S and t ∈ S, then st ∈ S and s − t ∈ S.


Example 9. If i² = −1 in ℂ, write ℤ(i) = {n + mi ∈ ℂ | m, n ∈ ℤ}. Then ℤ(i) is a
subring of ℂ by the subring test, called the ring of gaussian integers.


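Example 10. If R is any ring, let

$$T_2(R) = \begin{bmatrix} R & R \\ 0 & R \end{bmatrix} = \left\{ \begin{bmatrix} a & b \\ 0 & c \end{bmatrix} \;\middle|\; a, b, c \text{ in } R \right\}.$$

Show that T_2(R) is a subring of M_2(R). This is called the ring of upper triangular
matrices over R.


Solution. Clearly, the 2 × 2 zero matrix and the 2 × 2 identity matrix are in T_2(R).
Given $A = \begin{bmatrix} a & b \\ 0 & c \end{bmatrix}$ and $B = \begin{bmatrix} p & q \\ 0 & r \end{bmatrix}$ in T_2(R), it is enough, by the subring test, to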


48 The term subring is sometimes used for a general ring contained in R (possibly with a unity
different from that of R). However, we insist that S and R have the same unity. 




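observe that each of the following matrices is in T_2(R):

$$-A = \begin{bmatrix} -a & -b \\ 0 & -c \end{bmatrix}, \quad A + B = \begin{bmatrix} a+p & b+q \\ 0 & c+r \end{bmatrix}, \quad AB = \begin{bmatrix} ap & aq+br \\ 0 & cr \end{bmatrix}.$$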


The ring of upper triangular matrices will be referred to again. In general, subrings 
of M2(R) are a fertile source of interesting examples of rings. 


Example 11. The set of continuous functions ℝ → ℝ is an important subring of
F(ℝ, ℝ) (see Example 4). Closure under addition, multiplication, and negation are
theorems of calculus. The differentiable functions are also a subring of F(ℝ, ℝ).


Example 12. If R is a ring, the center Z(R) of R is a subring of R, where
Z(R) = {z ∈ R | zr = rz for all r ∈ R}.


Verification is left to the reader. Elements in Z(R) are said to be central in R. □


An element e in a ring R is called an idempotent if e² = e. Examples of
idempotents include 0 and 1 in any ring R, (1, 0) and (0, 1) in R × R,
$\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$ in M_2(R), and 3 and 4 in ℤ_6. If e is any idempotent, so also is (1 − e), and
e(1 − e) = 0 = (1 − e)e. If R is a ring, the idempotents in R provide an important
class of rings S contained in R (using the operations of R) that may not be subrings
(that is, they may have a different unity).
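
In a finite ring such as ℤ_n the idempotents can simply be listed. The following sketch is illustrative and not from the text; it assumes arithmetic modulo n, and `idempotents_mod` and `complement` are hypothetical helper names.

```python
def idempotents_mod(n):
    """All e in Z_n with e*e = e."""
    return [e for e in range(n) if (e * e) % n == e]

def complement(e, n):
    """The element 1 - e of Z_n, which is idempotent whenever e is."""
    return (1 - e) % n
```

For each idempotent e found this way, `complement(e, n)` is again in the list and the product e(1 − e) reduces to 0 modulo n, matching the remark above.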


Theorem 6. If e = e² in a ring R, write eRe = {ere | r ∈ R}. Then eRe is a ring
with unity e, and eRe = {a ∈ R | ea = a = ae}.


Proof. We have 0 = e0e, ere + ese = e(r + s)e, −ere = e(−r)e, and
(ere)(ese) = e(res)e. Hence, eRe is
an additive subgroup of R that is closed under multiplication. Finally, the fact that
e² = e means that e = eee ∈ eRe and e(ere) = ere = (ere)e. Thus e is the unity of
eRe, and eRe ⊆ {a ∈ R | ae = a = ea}. The reverse inclusion is easy. ∎


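The rings eRe, e² = e ∈ R, are called corners of the ring R. The next example
explains the name.


Example 13. If $e = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \in R = M_2(\mathbb{R})$, then e² = e and $eRe = \left\{ \begin{bmatrix} r & 0 \\ 0 & 0 \end{bmatrix} \;\middle|\; r \in \mathbb{R} \right\}$.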


Units and Division Rings 


If R is any ring, an element u in R is called a unit if u has a multiplicative
inverse in R (denoted u⁻¹). Thus u(u⁻¹) = 1 = (u⁻¹)u. The units in a ring R can
be “canceled”. More precisely, if u is a unit in R and r, s ∈ R then


ur = us implies r = s  and  ru = su implies r = s

by left and right multiplication by u⁻¹.
The set of all units in R, denoted R*, is a multiplicative group (Theorem 1 §2.2)
called the group of units of the ring R. This terminology is consistent with the 




notation M* for the units in any multiplicative monoid. For example, ℤ* = {1, −1},
and ℤ_n* = {k | gcd(k, n) = 1} by Theorem 5 §1.3.
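
The description of the units of ℤ_n can be compared with a direct search for inverses. The sketch below is an illustration only; `units_mod` and `coprime_residues` are hypothetical names, and elements of ℤ_n are represented by the residues 0, 1, ..., n − 1.

```python
from math import gcd

def units_mod(n):
    """Units of Z_n found from the definition: u is a unit if some v gives u*v = 1."""
    return [u for u in range(n) if any((u * v) % n == 1 % n for v in range(n))]

def coprime_residues(n):
    """Residues k in {0, ..., n-1} with gcd(k, n) = 1."""
    return [k for k in range(n) if gcd(k, n) == 1]
```

The two lists agree for every modulus tried, which is what the cited theorem predicts; the brute-force version is quadratic in n and meant only for small cases.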


Example 14. Show that ℤ(i)* = {1, −1, i, −i}, where ℤ(i) = {a + bi | a, b ∈ ℤ} is
the ring of gaussian integers (Example 9).


Solution. Let u = a + bi be a unit in ℤ(i). Because uu⁻¹ = 1, taking absolute values
gives |u| |u⁻¹| = 1. Since |u|² = a² + b² and |u⁻¹|² are positive integers with product 1,
it follows that a² + b² = |u|² = 1. As a and b are also integers,
the only solutions are a = ±1, b = 0, or a = 0, b = ±1, so u = 1, −1, i, or −i. □


Example 15. Let R be a commutative ring. If A is an n × n matrix in M_n(R),
the determinant det A of A is defined exactly as in M_n(ℝ), and satisfies

det A det B = det AB for all A, B ∈ M_n(R).

Then the usual linear algebra argument (when R = ℝ) gives

M_n(R)* = {A ∈ M_n(R) | det A ∈ R*}.

In addition, the adjugate adj A is defined, again exactly as in M_n(ℝ), and we have

If A ∈ M_n(R) and det A ∈ R*, then A⁻¹ = (det A)⁻¹ adj A.


We emphasize that R must be commutative. This is discussed in Appendix B.
For example, if n = 2, $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and det A = ad − bc is a unit in R, then

$$A^{-1} = (\det A)^{-1} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

as can be readily verified directly. □
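
The 2 × 2 formula can be tried out over the commutative ring ℤ_n. The sketch below is an illustration under assumptions: matrices are tuples of rows with entries modulo n, the helper names are hypothetical, and `pow(det, -1, n)` (modular inverse) requires Python 3.8 or later.

```python
from math import gcd

def mat_mul_mod(A, B, n):
    """Product of 2x2 matrices (tuples of rows) with entries reduced modulo n."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) % n for j in range(2))
        for i in range(2)
    )

def inverse_2x2_mod(A, n):
    """Inverse of A over Z_n via (det A)^(-1) adj A, or None if det A is not a unit."""
    (a, b), (c, d) = A
    det = (a * d - b * c) % n
    if gcd(det, n) != 1:           # det A is a unit of Z_n exactly when gcd(det, n) = 1
        return None
    det_inv = pow(det, -1, n)      # inverse of det A in Z_n
    return (((det_inv * d) % n, (det_inv * -b) % n),
            ((det_inv * -c) % n, (det_inv * a) % n))
```

Multiplying a matrix by the returned inverse with `mat_mul_mod` should give the identity matrix whenever the determinant is a unit.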


If R is not the zero ring, the zero element 0 is never a unit since otherwise 
1 = 0⁻¹0 = 0 by Theorem 1. Hence, every unit in R must be nonzero, and the rings
where the converse holds are very important. A ring R ≠ 0 is called a division
ring⁴⁹ if every nonzero element of R is a unit in R; that is, if R* = R ∖ {0}. A
commutative division ring is called a field. 


Example 16. ℚ, ℝ, and ℂ are fields. If p is a prime, ℤ_p is a field by Theorem 7
§1.3. Note that ℤ is not a field.


For now, we have no other examples of fields and no examples at all of noncommu- 
tative division rings. We have more to say about them in the next section. 

An element a in a ring R is said to be nilpotent (and is called a nilpotent
element) if a^k = 0 for some k ≥ 1. Clearly, 0 is a nilpotent in every ring. Other
examples of nilpotents include 2 and 4 in ℤ_8, $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$ in M_2(R), and
$\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$ in M_2(ℤ_2). The observation in Example 17 is often useful.


49 Also called a skew field. 




Example 17. If a is a nilpotent in R, show that 1 − a and 1 + a are units.


Solution. If a^n = 0 where n ≥ 1, then a^k = 0 for all k ≥ n by Theorem 1.
Hence u = 1 + a + a² + a³ + ··· is a finite sum, and so is an element of R.
Then (1 − a)u = 1 = u(1 − a), as the reader can verify. Hence 1 − a is a unit. As
(−a)^n = 0, 1 + a is a unit too. □
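
The finite sum in this solution can be computed explicitly in ℤ_n. The sketch below is illustrative only; the function name `inverse_of_one_minus` is hypothetical, and the loop bound relies on the observation (an assumption stated here, not taken from the text) that a nilpotent element of ℤ_n already satisfies a^k = 0 for some k ≤ n.

```python
def inverse_of_one_minus(a, n):
    """Return u = 1 + a + a^2 + ... (a finite sum) in Z_n, for a nilpotent modulo n.

    Raises ValueError if no power a^k with k <= n is zero modulo n.
    """
    total, power = 0, 1 % n          # power runs through a^0, a^1, a^2, ...
    for _ in range(n + 1):
        if power == 0:               # all further powers vanish, so the sum is complete
            return total
        total = (total + power) % n
        power = (power * a) % n
    raise ValueError("a is not nilpotent modulo n")
```

With a = 2 in ℤ_8 the sum is 1 + 2 + 4, and multiplying it by 1 − a gives 1 modulo 8, as the solution asserts.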


In elementary algebra it is proved that the geometric
series 1 + x + x² + x³ + ··· converges for any real number x with |x| < 1 and equals
(1 − x)⁻¹ in this case. In the solution to Example 17 we recognize that 1 + a + a² + ···
makes sense in any ring R when a is a nilpotent, which then provides a formula for
(1 − a)⁻¹. This argument makes sense for other power series provided the coefficients
are in the ring R, say e^x = 1 + x + (1/2!)x² + (1/3!)x³ + ···, assuming 2, 3, ... are all units.


Example 18. If a ring R has no nonzero nilpotent elements, show that all idempotents
in R are central.


Solution. If e² = e ∈ R and r ∈ R, write a = er(1 − e). Then a² = 0 because
(1 − e)e = 0, so a = 0 by hypothesis. It follows that er = ere. Similarly re = ere,
so er = ere = re. See Exercises 22 and 23 for other such properties. □


Ring Isomorphisms


The concept of isomorphic rings is analogous to the corresponding notion for groups, 
and it is equally important. Two rings R and S are called isomorphic (written 
R ≅ S) if there is a mapping σ : R → S that satisfies the following conditions.


(1) σ is a bijection.

(2) σ(r + s) = σ(r) + σ(s) for all r and s in R.  (σ preserves sums)

(3) σ(rs) = σ(r)·σ(s) for all r and s in R.  (σ preserves products)

Such a map σ is called a ring isomorphism. An isomorphism R → R is called an
automorphism of R.

Conditions (1) and (2) in the definition show that a ring isomorphism σ : R → S
is also an isomorphism of additive groups, so it also preserves zero, negatives, and
ℤ-multiples (Theorem 1 §2.5). That is, for r ∈ R:

σ(0) = 0, σ(−r) = −σ(r), and σ(kr) = kσ(r) for all k ∈ ℤ.

In addition, σ preserves the unity; that is

σ(1_R) = 1_S.

To see why, write σ(1_R) = e, and let s ∈ S. Since σ is onto, we have s = σ(r) for
some r ∈ R, so

se = σ(r)·σ(1_R) = σ(r·1_R) = σ(r) = s.

Similarly es = s, so e = 1_S is the unity of S. Moreover, conditions (2) and (3)
above show that σ preserves the addition and multiplication tables and hence, as
for groups, isomorphic rings R and S are the same except for the notations used.




Example 19. If R is a ring and u ∈ R* is a unit in R, define
σ_u : R → R  by  σ_u(r) = uru⁻¹ for all r ∈ R.


Then σ_u is an automorphism of the ring R called the inner automorphism determined
by u. Indeed, σ_u preserves addition and multiplication because

u(r + s)u⁻¹ = uru⁻¹ + usu⁻¹ and u(rs)u⁻¹ = (uru⁻¹)(usu⁻¹)

for all r, s ∈ R. The proof that σ_u is one-to-one and onto is left to the reader. □


These inner automorphisms are important. For example, two matrices A and 
B in M_n(R) are called similar if PAP⁻¹ = B for an invertible matrix P, that is
B = σ_P(A). This is a fundamental concept in linear algebra.


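Example 20. Show that $R = \left\{ \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \;\middle|\; a, b \in \mathbb{R} \right\}$ is isomorphic to ℂ.

Solution. The reader can verify that R is a subring of M_2(ℝ). Define σ : R → ℂ by
$\sigma \begin{bmatrix} a & -b \\ b & a \end{bmatrix} = a + bi$. Then σ is clearly onto; it is one-to-one because a + bi = a′ + b′i
in ℂ means that a = a′ and b = b′; and it preserves addition (verify). Finally,

$$\begin{aligned}
\sigma\left\{ \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \begin{bmatrix} a' & -b' \\ b' & a' \end{bmatrix} \right\}
&= \sigma \begin{bmatrix} aa' - bb' & -(ab' + ba') \\ ab' + ba' & aa' - bb' \end{bmatrix} \\
&= (aa' - bb') + (ab' + ba')i \\
&= (a + bi)(a' + b'i) \\
&= \sigma \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \cdot \sigma \begin{bmatrix} a' & -b' \\ b' & a' \end{bmatrix}
\end{aligned}$$

shows that σ preserves multiplication. Hence, σ is a ring isomorphism. □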
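
The multiplication check in this solution can be repeated numerically. The sketch below is an illustration only: it uses Python's built-in `complex` type for ℂ, restricts to integer entries so that the arithmetic is exact, and the names `mat`, `sigma`, and `mat_mul` are hypothetical helpers.

```python
def mat(a, b):
    """The matrix [[a, -b], [b, a]] stored as a tuple of rows."""
    return ((a, -b), (b, a))

def sigma(M):
    """Send [[a, -b], [b, a]] to the complex number a + bi."""
    (a, minus_b), (b, a2) = M
    if a != a2 or minus_b != -b:
        raise ValueError("matrix is not of the form [[a, -b], [b, a]]")
    return complex(a, b)

def mat_mul(A, B):
    """Ordinary product of 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )
```

For any two such matrices A and B, `sigma(mat_mul(A, B))` should equal `sigma(A) * sigma(B)`; the fact that `sigma` accepts the product at all also reflects closure of R under multiplication.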


Example 21. Show that the rings $R = \left\{ \begin{bmatrix} a & b \\ 0 & a \end{bmatrix} \;\middle|\; a, b \in \mathbb{Z}_2 \right\}$ and ℤ_2 × ℤ_2 are not
isomorphic as rings, even though they are isomorphic as additive groups.


Solution. The element $r = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ in R satisfies r² = 0. Suppose σ : R → ℤ_2 × ℤ_2 is
a ring isomorphism. If s = σ(r), then s² = σ(r)² = σ(r²) = σ(0) = 0. But s² = 0 in ℤ_2 × ℤ_2
implies that s = 0, giving r = 0 because σ is one-to-one. This is a contradiction, so
no such isomorphism σ can exist. However, the map $\begin{bmatrix} a & b \\ 0 & a \end{bmatrix} \mapsto (a, b)$ is an isomorphism
of additive groups, as the reader can verify. □


One of the consequences of Lagrange’s theorem is that every group of prime 
order must be cyclic. We conclude with the analogue for rings. 


Theorem 7. If R ≠ 0 is a ring and |R| = p is a prime, then R ≅ ℤ_p is a field.

Proof. Define θ : ℤ_p → R by θ(k) = k1_R. This is well defined; in fact

k = m in ℤ_p  ⇔  p | (k − m)  ⇔  (k − m)1_R = 0  ⇔  k1_R = m1_R in R,




so θ is well defined and one-to-one. Finally, 1_R is a generator of (R, +) by Lagrange’s
theorem (p is a prime), which shows that θ is onto. Since θ preserves sums and, by
Theorem 2, products, θ is an isomorphism. ∎


Exercises 3.1 


Throughout these exercises R denotes a ring unless otherwise specified. 


1. In each case explain why R is not a ring.

(a) R = {0, 1, 2, 3, ...}, operations of ℤ

(b) R = 2ℤ

(c) R = the set of all mappings f : ℝ → ℝ; addition as in Example 4 but using
composition as the multiplication


2. If R is a ring, define the opposite ring R^op to be the set R with the same addition
but with multiplication r·s = sr. Show that R^op is a ring.


3. In each case show that S is a subring of R.


or{ed 
) s={[5 _|[obeR}, R= MalR) 


a 0 b 

os={|2 c | 
0 04a 
: 


(4) S$ = a,beR}, R= Ma(R) 


abodeRate=b+dl, R= M,(R) 


a,b,c,d € x| , R= M;(R) 


4. If S and T are subrings of R, show that S ∩ T is a subring. Is this the case for
S + T = {s + t | s ∈ S, t ∈ T}?


5. If X is a nonempty subset of R, show that C(X) = {c ∈ R | cx = xc for all x ∈ X}
is a subring of R (called the centralizer of X in R).


6. (a) If ab = 0 in a division ring R, show that a = 0 or b = 0.
(b) If a² = b² in a field, show that a = b or a = −b.

7. Compute Z[M_2(R)] for any ring R.


8. (a) Show that (a + b)(a − b) = a² − b² in a ring R if and only if ab = ba.
(b) Show that (a + b)² = a² + 2ab + b² in a ring R if and only if ab = ba.


9. Show that a + b = b + a follows from the other ring axioms, where we assume that
both 0 + a = a and a + 0 = a hold for all a in R.


10. (a) If ab + ba = 1 and a³ = a in a ring, show that a² = 1.
(b) If ab = a and ba = b in a ring, show that a² = a and b² = b.


11. Show that 0 is the only nilpotent in R if and only if a² = 0 implies a = 0.
12. If a ≠ b in R satisfy a³ = b³ and a²b = b²a, show that a² + b² is not a unit.
13. If u, v, and u + v are all units in a ring R, show that u⁻¹ + v⁻¹ is also a unit and
give a formula for (u⁻¹ + v⁻¹)⁻¹ in terms of u, v, and (u + v)⁻¹. [Hint: Compute
u(u⁻¹ + v⁻¹)v.]

14. Given r and s in a ring R, show that 1 + rs is a unit if and only if 1 + sr is a unit.
[Hint: s(1 + rs) = (1 + sr)s.]

15. Show that the following conditions are equivalent for a general ring R.

(1) R has a unity. 

(2) R has a right unity (re =r for all r) and Ra = 0, a € R, implies that a = 0. 

(3) R has a unique right unity. 


16. If 1_R denotes the unity of a ring R, write ℤ1_R = {k1_R | k ∈ ℤ}.
(a) Show that ℤ1_R is a subring of R contained in Z(R).
(b) If char R = n, show that ℤ1_R ≅ ℤ_n.
(c) If char R = 0, show that ℤ1_R ≅ ℤ.

17. Describe the rings of characteristic 1.

18. In each case, find the characteristic of the ring.
(a) ℤ_n × ℤ_m  (b) M_2(ℤ_n)  (c) ℤ × ℤ_n

19. If u is a unit in R and char R < ∞, show that char R = o(u) in (R, +).

20. If ua = au, where u is a unit and a is a nilpotent, show that u + a is a unit.

21. (a) If e² = e in R, show that 1 − 2e is a unit, indeed self-inverse.
(b) If 2 ∈ R*, show that σ : {e | e² = e} → {u | u² = 1} is a bijection if σ(e) = 1 − 2e.

22. (a) If e² = e, show that (1 − e)re and er(1 − e) are nilpotents for all r ∈ R.
(b) If e² = e, show that e + (1 − e)re and e + er(1 − e) are idempotents for all r ∈ R.
(c) If e² = e, show that 1 + (1 − e)re and 1 + er(1 − e) are units for all r ∈ R.

23. Show that the following are equivalent for an idempotent e² = e ∈ R.
(1) e is central.  (2) ef = fe whenever f² = f.
(3) ea = ae for every nilpotent a.  (4) eu = ue for every unit u.
[Hint: Exercise 22.]

24. Consider the following conditions on R: (1) every unit is central, (2) every nilpotent
is central, and (3) every idempotent is central. Show that (1) ⇒ (2) ⇒ (3). [Hint:
Exercise 22.]

25. If r³ = r for all r ∈ R, show that R is commutative. [Hint: Use Example 18 and
Exercise 23 to show that a² is central for all a.] Remark: In fact, Jacobson’s theorem
asserts that R is commutative if, for each r ∈ R, some n ≥ 2 exists with r^n = r.

26. In each case show that ab = 1 in R implies that ba = 1.
(a) R is finite. [Hint: If R = {r_1, ..., r_n} show that {br_1, ..., br_n} = R.]
(b) Every idempotent in R is central.

27. (a) If a^m = a^(m+n) in a ring R, where n ≥ 1, show that (a^t)² = a^t for some t. [Hint:
a^(m+r) = a^(m+kn+r) for all k ≥ 0 and r ≥ 0.]
(b) If R is finite, show that some power of each element is an idempotent.

28. In each case find the units, the nilpotents, and the idempotents in R.


(a)R=Z  (b)R=Zoy — (c) R= Ma (Za) (a) R=|§ R| 


oR 
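29. Let $R = \begin{bmatrix} \mathbb{Z} & X \\ 0 & \mathbb{Z} \end{bmatrix} = \left\{ \begin{bmatrix} n & x \\ 0 & m \end{bmatrix} \;\middle|\; n, m \in \mathbb{Z};\ x \in X \right\}$ where X is any abelian group.
Show that R is a ring with the usual matrix operations and find the units, nilpotents
and idempotents.

30. Show that $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is invertible in M_2(R) if a and d − ca⁻¹b are invertible in
R. [Hint: Find p and q such that $\begin{bmatrix} 1 & 0 \\ q & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & 0 \\ 0 & d - ca^{-1}b \end{bmatrix} \begin{bmatrix} 1 & p \\ 0 & 1 \end{bmatrix}$.]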


31. If m is odd, show that m is an idempotent in ℤ_2m.

If a?” = a for all a € R, show that 2a = 0 for allac R. 

33. A ring R is called a boolean ring (after George Boole) if r² = r for all r ∈ R. Show
that every boolean ring R ≠ 0 is commutative of characteristic 2.

34. Let R = {X | X ⊆ U} where U is a set. If X ⊕ Y = (X ∖ Y) ∪ (Y ∖ X) and
XY = X ∩ Y, show that R is a boolean ring (Exercise 33).


35. Show that $\begin{bmatrix} R & R \\ 0 & R \end{bmatrix} \cong \begin{bmatrix} R & 0 \\ R & R \end{bmatrix}$ for any ring R.

36. In each case show that the given rings are not isomorphic.

(a) ℝ and ℂ  (b) ℚ and ℝ  (c) ℤ and ℚ  (d) ℤ_8 and ℤ_4 × ℤ_2

37. If R and R′ are rings and σ : R → R′ is an onto mapping satisfying σ(rs) = σ(r)·σ(s)
for all r, s ∈ R, show that σ(1) = 1.

38. If R ≅ S are rings, show that: (a) Z(R) ≅ Z(S); (b) R* ≅ S* (as groups).

39. Let X be an additive abelian group. A group homomorphism α : X → X is called
an endomorphism of X. Given another endomorphism β : X → X, define the sum
α + β : X → X by (α + β)(x) = α(x) + β(x) for all x ∈ X. Show that the set
end X of all endomorphisms of X is a ring using this addition, and using composition
as the multiplication.

40. Write M_2(R) = S. Find an idempotent e² = e in S such that R ≅ eSe.

41. If e² = e ∈ R, show that σ : (eRe)* → R* is a one-to-one group homomorphism
where σ(a) = a + (1 − e) for all a ∈ (eRe)*.

42. (a) Show that k ∈ ℤ_n is nilpotent if and only if all prime divisors of n divide k.
(b) If n = ab where gcd(a, b) = 1, and if 1 = xa + yb where x, y ∈ ℤ, show that xa
is an idempotent in ℤ_n.
(c) Show that every idempotent in ℤ_n arises as in (b). [Hint: Exercise 35 §1.2.]

43. Show that there are four nonisomorphic rings of order 4, isomorphic to one of ℤ_4,
ℤ_2 × ℤ_2, $L = \left\{ \begin{bmatrix} a & b \\ 0 & a \end{bmatrix} \;\middle|\; a, b \in \mathbb{Z}_2 \right\}$, and a field (see Section 4.3).


3.2. INTEGRAL DOMAINS AND FIELDS 


We have shown that a·0 = 0 = 0·a holds for every element a in any ring. One of
the most useful properties of the ring ℝ of real numbers is that the only way that
a product can equal zero is if one of the factors is zero; that is, ab = 0 in ℝ implies
that a = 0 or b = 0. This property is a fundamental tool for solving equations.
For example, the usual method for solving the quadratic equation x² − x − 12 = 0,
x ∈ ℝ, is first to factor it as (x − 4)(x + 3) = 0 and then conclude that x − 4 = 0
or x + 3 = 0; that is, x = 4 or x = −3. In this section we investigate rings in which
ab = 0 implies that a = 0 or b = 0. The next theorem identifies two other equivalent
conditions.


Theorem 1. The following conditions are equivalent for a ring R.

(1) If ab = 0 in R, then either a = 0 or b = 0.

(2) If ab = ac in R and a ≠ 0, then b = c.

(3) If ba = ca in R and a ≠ 0, then b = c.

Proof. (1) ⇒ (2). Given (1), let ab = ac, where a ≠ 0. Then ab − ac = 0, so
a(b − c) = 0. As a ≠ 0, (1) implies that b − c = 0; that is, b = c.

(2) ⇒ (1). Assume (2) and let ab = 0 in R. If a = 0, there is nothing to prove.
If a ≠ 0, the fact that ab = a0 (= 0) gives b = 0 by (2).

The proof that (1) ⇔ (3) is analogous. ∎




A ring R ≠ 0 is called a domain if the conditions in Theorem 1 are satisfied. A
commutative domain is called an integral domain.


Example 1. Z is an integral domain—hence the name. 


Example 2. Show that every division ring is a domain, and hence that every field 
is an integral domain. 


Solution. Let ab = 0 in the division ring R; we must show that a = 0 or b = 0. But
if a ≠ 0, then a⁻¹ exists by hypothesis, so b = 1b = a⁻¹ab = a⁻¹0 = 0. Thus, R is a
domain. □


Example 3. Show that every subring of a division ring (a field) is a domain 
(integral). 


Solution. Let R be a subring of a division ring D. If ab = 0 in R, then ab = 0 in
D too, so a = 0 or b = 0 by Example 2. Thus, R is a domain. If D is a field, R is
commutative and so is an integral domain. □


Thus, the ring ℤ(i) = {m + ni | m, n ∈ ℤ} of gaussian integers is an integral
domain. In fact many interesting examples of fields and integral domains arise as
subrings of ℂ. Here is an example that is actually a field.


Example 4. Write ℚ(√2) = {r + s√2 | r, s ∈ ℚ}. Show that ℚ(√2) is a field.


Solution. Verifying that ℚ(√2) is a subring of ℝ is easy. To verify that it is a
field, it is convenient to introduce the following notions: By analogy with ℂ, given
a = r + s√2 in ℚ(√2), define its conjugate a* and norm N(a) by

a* = r − s√2 and N(a) = r² − 2s².

Observe that N(a) = aa*. Suppose now that a ≠ 0 in ℚ(√2). If a = r + s√2 then
r ≠ 0 or s ≠ 0, and so N(a) = r² − 2s² ≠ 0 in ℚ because √2 ∉ ℚ (Example 3 §0.1). But then
1/N(a) ∈ ℚ, so the fact that aa* = N(a) implies that a⁻¹ = (1/N(a)) a* exists in ℚ(√2).
Hence, ℚ(√2) is a field. □
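
The inverse formula a⁻¹ = (1/N(a)) a* lends itself to exact computation. The sketch below is an illustration only: an element r + s√2 is represented by the pair (r, s) of `fractions.Fraction` values, and the function names `norm`, `inverse`, and `multiply` are hypothetical helpers.

```python
from fractions import Fraction

def norm(r, s):
    """N(r + s*sqrt(2)) = r^2 - 2 s^2."""
    return r * r - 2 * s * s

def inverse(r, s):
    """Inverse of r + s*sqrt(2), returned as a pair, using a^(-1) = a*/N(a)."""
    r, s = Fraction(r), Fraction(s)
    N = norm(r, s)
    if N == 0:
        raise ZeroDivisionError("only the zero element has norm 0")
    return (r / N, -s / N)

def multiply(x, y):
    """(r + s*sqrt(2))(p + q*sqrt(2)) = (rp + 2sq) + (rq + sp)*sqrt(2)."""
    (r, s), (p, q) = x, y
    return (r * p + 2 * s * q, r * q + s * p)
```

For example, `inverse(1, 1)` represents the inverse of 1 + √2, and multiplying it back by (1, 1) with `multiply` returns the pair for 1.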


The analogy between ℚ(√2) and ℂ goes further: It is not difficult to verify that
(ab)* = a*b* holds for all a and b in ℚ(√2), and hence that N(ab) = N(a)N(b).
Some consequences of this are explored in Exercise 21.

The ring ℚ(√2) in Example 4 is the result of adjoining an element √2 (not in
ℚ) to the field ℚ. In this case everything is going on inside ℝ, and the resulting ring
is a subring of ℝ. Similarly, the gaussian integers ℤ(i) are the result of adjoining i
to ℤ inside ℂ. This adjoining process works more generally.

For example, if R is any ring, we write R(w) to denote all formal sums r + sw,
where r and s are in R:


R(w) = {r + sw | r, s ∈ R}.

As in ℂ, we decree that

r + sw = r′ + s′w if and only if r = r′ and s = s′,


and we insist that

w² ∈ R and rw = wr, for all r ∈ R.


Then the ring axioms determine the addition and multiplication in R(w). Taking
w = i, we obtain ℝ(i) = ℂ, and ℤ(i) is the ring of gaussian integers as before.




We investigate this construction, and others like it, in Section 5.2. For the present,
we use it informally to construct a field of nine elements.


Example 5. Show that $\mathbb{Z}_3(w)$ is a field with nine elements if $w^2 = -1$.


Solution. Write $\mathbb{Z}_3 = \{0, 1, 2\}$. Then

\[ \mathbb{Z}_3(w) = \{0,\ 1,\ 2,\ w,\ 2w,\ 1 + w,\ 1 + 2w,\ 2 + w,\ 2 + 2w\}. \]

For $a = r + sw$ in $\mathbb{Z}_3(w)$, write $a^* = r - sw$, so that $aa^* = r^2 + s^2 \in \mathbb{Z}_3$. If $a \neq 0$, then
$r \neq 0$ or $s \neq 0$ holds in $\mathbb{Z}_3$, by our definition of $R(w)$. This means that $r^2 + s^2 \neq 0$
in $\mathbb{Z}_3$ (in fact, $r^2 = 0$ or $1$ for all $r$ in $\mathbb{Z}_3$). Write $b = (r^2 + s^2)^{-1}a^*$. Then $ab = 1 = ba$,
so $b = a^{-1}$. □
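
Because $\mathbb{Z}_3(w)$ has only nine elements, the argument in Example 5 can be confirmed by brute force. The sketch below is an illustration under stated assumptions: the element $r + sw$ is encoded as the pair `(r, s)`, and the helper names `mul` and `inverse` are hypothetical. The three-argument `pow(n, -1, P)` requires Python 3.8 or later.

```python
from itertools import product

P = 3  # coefficients live in Z_3; w is a formal symbol with w*w = -1


def mul(a, b):
    """(r + s w)(r' + s' w) = (rr' - ss') + (rs' + sr') w, coefficients mod P."""
    (r, s), (r2, s2) = a, b
    return ((r * r2 - s * s2) % P, (r * s2 + s * r2) % P)


elements = list(product(range(P), repeat=2))   # the pair (r, s) stands for r + s w
one = (1, 0)


def inverse(a):
    """Return (r^2 + s^2)^{-1} a*, following the solution of Example 5."""
    r, s = a
    n = (r * r + s * s) % P      # a a* = r^2 + s^2
    n_inv = pow(n, -1, P)        # raises ValueError if n is not invertible mod P
    return ((r * n_inv) % P, (-s * n_inv) % P)


inverses = {a: inverse(a) for a in elements if a != (0, 0)}
```

For instance `inverses[(1, 1)]` is `(2, 1)`, that is, $(1 + w)^{-1} = 2 + w$. Changing `P` to 5 makes `inverse((1, 2))` fail, because $1^2 + 2^2 = 0$ in $\mathbb{Z}_5$.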


We now turn to other properties of domains.

Theorem 2. The characteristic of any domain is either zero or a prime.


Proof. Let $R$ be a domain and suppose char $R \neq 0$, say char $R = n > 0$. If $n$ is not a
prime, let $n = km$, where $1 < k < n$ and $1 < m < n$. If $1$ is the unity of $R$, then
Theorem 2 §3.1 gives $(k1)(m1) = (km)(1 \cdot 1) = n1 = 0$. Hence, $k1 = 0$ or $m1 = 0$
because $R$ is a domain, a contradiction because $n = o(1)$ is the additive order of $1$. Hence, $n$ is a prime. ■


Because char $\mathbb{Z}_n = n$ for each $n \geq 2$, Theorem 2 shows that $\mathbb{Z}_n$ is an integral
domain if and only if $n$ is a prime, that is, if and only if $\mathbb{Z}_n$ is a field. This also
follows from Theorem 3.


Theorem 3. Every finite integral domain is a field. 


Proof. Let $R$ be a finite integral domain, say $|R| = n$, and write $R = \{r_1, r_2, \ldots, r_n\}$.
Given $a \neq 0$ in $R$, the set $aR = \{ar_1, ar_2, \ldots, ar_n\}$ has distinct elements ($ar_i = ar_j$
implies $r_i = r_j$ by Theorem 1). Hence $|aR| = n$ so, since $aR \subseteq R$ and $|R| = n$, we
have $aR = R$. In particular, $1 \in aR$, say $1 = ab$, $b \in R$. Because $R$ is commutative,
this shows that $a$ is a unit. Hence, $R$ is a field. ■
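
The counting argument in this proof can be watched at work in the rings $\mathbb{Z}_n$. The following Python sketch is illustrative only (the function names are hypothetical): it tests by brute force whether $\mathbb{Z}_n$ is a domain, whether it is a field, and computes the set $aR$ used in the proof.

```python
def is_domain(n):
    """Z_n has no zero divisors: ab = 0 only when a = 0 or b = 0."""
    return all((a * b) % n != 0 for a in range(1, n) for b in range(1, n))


def is_field(n):
    """Every nonzero a has some b with ab = 1, that is, 1 lies in aR."""
    return all(any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n))


def translate(a, n):
    """The set aR = {a r : r in Z_n} from the proof of Theorem 3."""
    return {(a * r) % n for r in range(n)}
```

In $\mathbb{Z}_7$ the set `translate(3, 7)` is all of $\mathbb{Z}_7$, so it contains $1$. In $\mathbb{Z}_6$, which is not a domain, `translate(2, 6)` collapses to `{0, 2, 4}` and misses $1$.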


A similar argument shows that every finite domain is a division ring (Exercise 
23). The reason for only considering the commutative case is a remarkable theorem 
first proved in 1905 by J.H.M. Wedderburn. We prove it in Section 10.4. 


Wedderburn’s Theorem. Every finite division ring is a field. 


This theorem seems to indicate that noncommutative division rings are rare. 
However, an example called the quaternions has been known since 1843. 


Quaternions 


In the early part of the nineteenth century, the importance of the complex numbers
was becoming increasingly apparent. The Irish mathematician William Rowan
Hamilton gave the first modern exposition of the complex numbers in 1833. The set
of complex numbers can be identified with the points in the plane, and Hamilton
was looking for an analogous algebra to describe three-dimensional space. After
a frustrating 10-year search, he finally realized that the algebra he sought must
be four-dimensional and that the commutative law must fail. He called these new
“numbers” quaternions and subsequently devoted a great deal of time to them.
However, their use has been limited by the great success of vector analysis.

Complex numbers have the form $a + bi$ where $a$ and $b$ are real and $i^2 = -1$. By
analogy, the set $\mathbb{H}$ of quaternions is defined by

\[ \mathbb{H} = \{a + bi + cj + dk \mid a, b, c, d \text{ in } \mathbb{R}\}. \]

Here, as for complex numbers, we require that

\[ a + bi + cj + dk = a' + b'i + c'j + d'k \iff a = a',\ b = b',\ c = c', \text{ and } d = d'.^{50} \]

We also insist that each $r \in \mathbb{R}$ commutes with each of $i$, $j$, and $k$. With this,
the multiplication in $\mathbb{H}$ is determined by the distributive laws once the products
$i^2, ij, ji, \ldots$ are specified. These in turn follow from the equations

\[ i^2 = j^2 = k^2 = ijk = -1,^{51} \]

which yield the following formulas:

\[ ij = k = -ji, \qquad jk = i = -kj, \qquad ki = j = -ik. \]

[Diagram: the symbols $i$, $j$, $k$ placed clockwise around a circle.]

These formulas are best remembered from the diagram: The product of any two
of $i$, $j$ and $k$ taken clockwise around the circle is the next one, while the product
counterclockwise is the negative of the next one.⁵²

The fact that $\mathbb{H}$ is associative can be either verified directly, or by noting that
there is a concrete realization of $\mathbb{H}$ as a subring of the ring $M_2(\mathbb{C})$ of $2 \times 2$ matrices
over $\mathbb{C}$ (Exercise 31). The ring $\mathbb{C}$ of complex numbers is regarded as a subring of $\mathbb{H}$
by identifying $a + bi = a + bi + 0j + 0k$.

The following example illustrates how products in H are computed. 


Example 6. $(3 - 4j)(2i + k) = 6i + 3k - 8ji - 4jk = 6i + 3k + 8k - 4i = 2i + 11k$.
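
The product in Example 6 can be reproduced mechanically. The following Python sketch is illustrative and makes one assumption about representation: a quaternion $a + bi + cj + dk$ is stored as the tuple `(a, b, c, d)`. The function `qmul` (a hypothetical name) expands the product by the distributive laws and the rules for $i$, $j$, $k$ given above.

```python
def qmul(p, q):
    """Product of quaternions stored as tuples (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # real part
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # coefficient of i
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # coefficient of j
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)   # coefficient of k


i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
example6 = qmul((3, 0, -4, 0), (0, 2, 0, 1))   # (3 - 4j)(2i + k)
```

The value of `example6` is `(0, 2, 0, 11)`, that is, $2i + 11k$. The same function gives `qmul(i, j) == k` while `qmul(j, i)` is $-k$, which exhibits the failure of commutativity.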

Example 7. Show that $\mathbb{R}$ is the center of $\mathbb{H}$.


Solution. If $a \in \mathbb{R}$ then $aq = qa$ for all $q \in \mathbb{H}$ because $a$ commutes with $i$, $j$, and $k$.
Conversely, let $q = a + bi + cj + dk$ lie in $Z(\mathbb{H})$. Then the fact that $qi = iq$ gives
$-b + ai + dj - ck = -b + ai - dj + ck$. Equating coefficients gives $c = 0 = d$, so
$q = a + bi$. But then $qj = jq$ implies $b = 0$, so $q = a \in \mathbb{R}$, as required. □


If $z = a + bi$ is a complex number, we have $\bar{z} = a - bi$ and $|z|^2 = a^2 + b^2$. The
analogy between $\mathbb{C}$ and $\mathbb{H}$ leads to a natural extension of these important notions
to $\mathbb{H}$. Given a quaternion $q = a + bi + cj + dk$, define the conjugate $q^*$ and the
norm $N(q)$ as follows:

\[ q^* = a - bi - cj - dk \quad\text{and}\quad N(q) = a^2 + b^2 + c^2 + d^2. \]


⁵⁰ In other words, $\mathbb{H}$ is a four-dimensional vector space over $\mathbb{R}$ with basis $\{1, i, j, k\}$.

⁵¹ These equations first occurred to Hamilton while he was out walking, and he was so impressed
with their importance that he carved the symbols with a knife on Brougham Bridge in Dublin.
The date was October 16, 1843.

⁵² In Section 2.8, the set $Q = \{\pm 1, \pm i, \pm j, \pm k\}$ was called the quaternion group.




A routine calculation establishes the following fact:

\[ qq^* = N(q) = q^*q \quad\text{for every quaternion } q. \]

With this we prove


Theorem 4. The ring $\mathbb{H}$ is a noncommutative division ring. Moreover, if $q \neq 0$ in
$\mathbb{H}$, then $q^{-1} = \frac{1}{N(q)}q^*$.


Proof. $\mathbb{H}$ is noncommutative (for example, $ij \neq ji$). If $q = a + bi + cj + dk \neq 0$ in
$\mathbb{H}$, then one of $a$, $b$, $c$, or $d$ is nonzero, so $N(q) = a^2 + b^2 + c^2 + d^2 \neq 0$ in $\mathbb{R}$. Since
$N(q) \in \mathbb{R}$ is central in $\mathbb{H}$, the equations $qq^* = N(q) = q^*q$ give $q^{-1} = \frac{1}{N(q)}q^*$. ■


We mention one more fact about $\mathbb{H}$. It is not difficult to verify (Exercise 30)
that the norm is multiplicative in the sense that

\[ N(pq) = N(p)N(q) \quad\text{for all } p \text{ and } q \text{ in } \mathbb{H}. \]

This formula shows that the product $N(p) \cdot N(q)$ of two sums of four squares can
itself be written as a sum of four squares. This is Lagrange’s famous four square
identity. The analogue for two squares is also true, and is a consequence of the
fact that $|zw|^2 = |z|^2|w|^2$ for any complex numbers $z$ and $w$.
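
Both the inverse formula of Theorem 4 and the multiplicative property of the norm are easy to test numerically. The sketch below is an illustration under the assumption that a quaternion $a + bi + cj + dk$ is stored as the tuple `(a, b, c, d)`. The helper names are hypothetical, and exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction


def qmul(p, q):
    """Product of quaternions stored as tuples (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


def conj(q):
    """q* = a - bi - cj - dk."""
    a, b, c, d = q
    return (a, -b, -c, -d)


def norm(q):
    """N(q) = a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in q)


def qinv(q):
    """q^{-1} = q* / N(q), valid for q != 0."""
    n = norm(q)
    if n == 0:
        raise ZeroDivisionError("0 has no inverse")
    return tuple(Fraction(x) / n for x in conj(q))


p, q = (1, 2, 3, 4), (2, -1, 0, 5)
```

Here $N(p) = N(q) = 30$ and $pq = -16 + 18i - 8j + 16k$, so the identity reads $30 \cdot 30 = 900 = 16^2 + 18^2 + 8^2 + 16^2$.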


Field of Quotients 


By Example 3, every subring of a field is an integral domain. The converse also
holds: Every integral domain $R$ is isomorphic to a subring of a field $F$ (we say
$R$ is embedded in $F$). The prototype example is $\mathbb{Z}$, where we regard $\mathbb{Z} \subseteq \mathbb{Q}$ by
identifying the integer $n$ with the fraction $\frac{n}{1}$. More generally, if $R$ is any integral
domain, we construct a field $Q$ of all fractions or quotients $\frac{r}{u}$ from $R$ and show that
$R$ can be identified with a subring of $Q$.

The fact that, for example, $\frac{1}{2}$ and $\frac{3}{6}$ are equal fractions must seem mysterious
when it is first encountered in school, and some pupils probably are not too
enlightened when the teacher points out that $1 \cdot 6 = 2 \cdot 3$. The reason, of course, is
that a fraction such as $\frac{1}{2}$ represents a whole class of pairs of integers $(m, n)$, where
$\frac{m}{n} = \frac{1}{2}$. This representation suggests that an equivalence relation is at work. Our
jumping-off point is the observation that $\frac{m}{n} = \frac{m'}{n'}$ in $\mathbb{Q}$ if and only if $mn' = m'n$.

This last equation makes sense in any integral domain $R$, and we use it to
construct quotients $\frac{r}{u}$ from $R$ as equivalence classes. First, we let

\[ X = \{(r, u) \mid r \in R,\ u \in R,\ u \neq 0\} \]

and define a relation $\equiv$ on $X$ by

\[ (r, u) \equiv (s, v) \quad\text{if and only if}\quad rv = su. \]

We claim that this is an equivalence on $X$. Clearly, $(r, u) \equiv (r, u)$ for all $(r, u)$ in $X$,
and $(r, u) \equiv (s, v)$ implies that $(s, v) \equiv (r, u)$. To prove transitivity, let

\[ (r, u) \equiv (s, v) \quad\text{and}\quad (s, v) \equiv (t, w). \]

Then $rv = su$ and $sw = tv$, so $(rw)v = (rv)w = (su)w = u(sw) = utv$ (as $R$ is
commutative). We have $v \neq 0$ because $(s, v) \in X$, so we may cancel $v$ in the domain
$R$ to obtain $rw = tu$; that is,

\[ (r, u) \equiv (t, w). \]




Thus $\equiv$ is an equivalence on $X$.
Motivated by the case $R = \mathbb{Z}$, we define the quotient $\frac{r}{u}$ to be the equivalence
class $[(r, u)]$ of the pair $(r, u)$ in $X$. More precisely, we write

\[ \frac{r}{u} = [(r, u)]. \]

Now we invoke Theorem 1 §0.4 that $[(r, u)] = [(s, v)]$ if and only if $(r, u) \equiv (s, v)$.
In our quotient notation, this extends the familiar fact about rational fractions:

\[ \frac{r}{u} = \frac{s}{v} \quad\text{if and only if}\quad rv = su. \tag{$*$} \]

Moreover, this condition implies another useful property of rational fractions:

\[ \frac{rv}{uv} = \frac{r}{u} \quad\text{for all } v \neq 0 \text{ in } R. \tag{$**$} \]

So we have created the quotients we wanted.
Now let $Q$ denote the set of all these quotients; that is,

\[ Q = \left\{ \tfrac{r}{u} \;\middle|\; r, u \text{ in } R \text{ and } u \neq 0 \right\}. \]

Our objective is to make $Q$ into a field. Once again motivated by $\mathbb{Q}$, we define
addition and multiplication in $Q$ by

\[ \frac{r}{u} + \frac{s}{v} = \frac{rv + su}{uv} \quad\text{and}\quad \frac{r}{u} \cdot \frac{s}{v} = \frac{rs}{uv}, \]

where $rv + su$, $rs$ and $uv$ on the right-hand side of each equation are computed in
$R$. Note that $uv \neq 0$ because $R$ is a domain, so these are legitimate quotients in $Q$.
Because these quotients are equivalence classes, we must show that addition
and multiplication are well defined by these formulas. We do it for addition: If
$\frac{r}{u} = \frac{r'}{u'}$ and $\frac{s}{v} = \frac{s'}{v'}$, we must show that $\frac{rv + su}{uv} = \frac{r'v' + s'u'}{u'v'}$. We have $ru' = r'u$ and
$sv' = s'v$ by ($*$), and we must show that $uv(r'v' + s'u') = u'v'(rv + su)$, again by
($*$). Compute

\[ uv(r'v' + s'u') = (r'u)vv' + (s'v)uu' = (ru')vv' + (sv')uu' = u'v'(rv + su) \]

as required. The verification that multiplication is well defined is left to the reader.
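With this, we can show that $Q$ really is a field. Most of this will also be left to
the reader; we verify the associative law of addition:

\[
\begin{aligned}
\frac{r}{u} + \left(\frac{s}{v} + \frac{t}{w}\right)
  &= \frac{r}{u} + \frac{sw + tv}{vw} = \frac{r(vw) + (sw + tv)u}{u(vw)} \\
  &= \frac{(rv + su)w + t(uv)}{(uv)w} = \frac{rv + su}{uv} + \frac{t}{w} \\
  &= \left(\frac{r}{u} + \frac{s}{v}\right) + \frac{t}{w}.
\end{aligned}
\]

Similar calculations show that $Q$ is a commutative ring where, if $u \neq 0$ in $R$, the
zero is $\frac{0}{1} = \frac{0}{u}$, the unity is $\frac{1}{1} = \frac{u}{u}$, and the negative of $\frac{r}{u}$ is $\frac{-r}{u}$. Moreover, if $\frac{r}{u}$ is
nonzero in $Q$, then $r \neq 0$, so $\frac{u}{r} \in Q$. Then ($**$) gives

\[ \frac{r}{u} \cdot \frac{u}{r} = \frac{ru}{ur} = \frac{1}{1}, \quad\text{the unity of } Q. \]

Hence $\left(\frac{r}{u}\right)^{-1} = \frac{u}{r}$, and we have proved that $Q$ is a field.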
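
The construction above translates almost line by line into code. The following Python sketch is an illustration for the special case $R = \mathbb{Z}$ (an assumption made so that the built-in integers can serve as the domain). The class name `Quot` is hypothetical. A quotient is stored as the raw pair $(r, u)$ with no reduction to lowest terms, so equality has to go through the cross-multiplication test ($*$).

```python
class Quot:
    """Quotient r/u over Z, kept as the unreduced pair (r, u) with u != 0."""

    def __init__(self, r, u):
        if u == 0:
            raise ZeroDivisionError("denominator must be nonzero")
        self.r, self.u = r, u

    def __eq__(self, other):
        # r/u = s/v if and only if rv = su
        return self.r * other.u == other.r * self.u

    def __add__(self, other):
        # r/u + s/v = (rv + su)/(uv)
        return Quot(self.r * other.u + other.r * self.u, self.u * other.u)

    def __mul__(self, other):
        # (r/u)(s/v) = rs/(uv)
        return Quot(self.r * other.r, self.u * other.u)

    def inverse(self):
        # (r/u)^{-1} = u/r; the constructor rejects r = 0
        return Quot(self.u, self.r)

    def __repr__(self):
        return f"{self.r}/{self.u}"
```

Because `Quot(1, 2) == Quot(3, 6)` and `Quot(1, 3) == Quot(2, 6)`, well-definedness of addition predicts that `Quot(1, 2) + Quot(1, 3)` and `Quot(3, 6) + Quot(2, 6)` are equal, even though the stored pairs `(5, 6)` and `(30, 36)` differ.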




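Finally, we can easily verify that $R' = \{\frac{r}{1} \mid r \in R\}$ is a subring of $Q$. Let
$\sigma : R \to R'$ be defined by $\sigma(r) = \frac{r}{1}$ for all $r \in R$. Then $\sigma$ is clearly onto, and
it is one-to-one because $\frac{r}{1} = \frac{s}{1}$ implies that $r = s$ by ($*$). Moreover, $\sigma$ is a ring
isomorphism because

\[ \frac{r}{1} + \frac{s}{1} = \frac{r + s}{1} \quad\text{and}\quad \frac{r}{1} \cdot \frac{s}{1} = \frac{rs}{1}. \]

Hence $R \cong R'$. Customary practice is to identify $R = R'$ by taking $r = \frac{r}{1}$ for all
$r \in R$ (as in $\mathbb{Z} \subseteq \mathbb{Q}$), and so to regard $R$ as an actual subring of $Q$.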


Theorem 5. Embedding Theorem. If $R$ is an integral domain, there is a field
$Q$ consisting of quotients $\frac{r}{u}$, where $r$ and $u \neq 0$ are elements of $R$. By identifying
$r = \frac{r}{1}$ for all $r \in R$ we may (and do) regard $R$ as a subring of $Q$. In that case, every
$u \neq 0$ in $R$ has an inverse in $Q$, and each quotient in $Q$ has the form $\frac{r}{u} = ru^{-1}$,
where $r$ and $u \neq 0$ are in $R$.


Proof. Only the last sentence remains to be proved. We have

\[ \frac{r}{u} = \frac{r}{1} \cdot \frac{1}{u} = \frac{r}{1} \left(\frac{u}{1}\right)^{-1}, \]

which becomes $ru^{-1}$ if we identify $r = \frac{r}{1}$ for all $r \in R$. ■


The field Q constructed in Theorem 5 is called the field of quotients of the integral 
domain R. 

The construction of the field $Q$ of quotients of an integral domain $R$ depends
heavily on the fact that $R$ is commutative. This dependence is in fact essential,
because there exist noncommutative domains that cannot be embedded in a division
ring. The first such example was discovered in 1937 by the Russian mathematician
Anatoly Ivanovich Mal’cev.⁵³ On the other hand, a wide class of noncommutative
domains can be embedded in a division ring of right quotients. These are called
right Ore-domains, after Øystein Ore who first discussed them in 1931.


Exercises 3.2 


Throughout these exercises R denotes a ring unless otherwise specified. 


1. Find all the roots of $x^2 + 3x - 4$ in
(a) Z (b) Ze (c) Zs 

2. If $p$ is a prime, let $\mathbb{Z}_{(p)} = \{\frac{n}{m} \in \mathbb{Q} \mid p \text{ does not divide } m\}$. Show that this is an
integral domain and find all the units.

3. Determine all idempotents and nilpotents in a domain.

4. Is $R \times S$ ever a domain? Support your answer.

5. Show that $M_n(R)$ is never a domain if $n \geq 2$.

6. If $a^2 = b^2$ and $a^3 = b^3$ in a domain, show that $a = b$. Now do it if $a^m = b^m$ and
$a^n = b^n$ where $\gcd(m, n) = 1$. [Hint: $1 = xm + yn$, where $x, y \in \mathbb{Z}$.]


⁵³ Mal’cev, A.I., Groups and other algebraic systems, in Mathematics: Its Contents and Meaning,
Vol. 3, Cambridge MA: MIT Press, 1963.


7. Suppose that $R$ has no nonzero nilpotent elements (for example, a domain). If
$ab = 0$ in $R$, show that $ba = 0$.

8. Show that a ring $R$ is a division ring if and only if, for each nonzero $a \in R$, there is
a unique element $b \in R$ such that $aba = a$.

9. Find a finite field in which $a^2 + b^2 = 0$ implies that $a = b = 0$, and find another in
which this is not true.


10. If $F = \{0, 1, a, b\}$ is a field, fill in the addition and multiplication tables for $F$.

11. If $F$ is a field and $|F| = q$, show that $a^q = a$ for all $a \in F$. [Hint: Lagrange.]

12. Show that the characteristic of a finite field must be a prime.

13. If $F$ is a field and $|F| = p$, where $p$ is a prime, show that $F \cong \mathbb{Z}_p$.

14. Show that there is no field of order 6. [Hint: Lagrange’s theorem.]

15. Show that the center of a division ring is a field.

16. Let $K$ be a subring of a field $F$. Call $K$ a subfield of $F$ if it is a field using the
operations of $F$.

(a) Show that $K$ is a subfield of $F$ if and only if $0 \neq a \in K$ implies that $a^{-1} \in K$.
(b) If $|F| = 8$ and $K$ is a subfield, show that $K = F$ or $K = \{0, 1\}$. [Hint: Lagrange.]
(c) What happens if $|F| = 16$ in (b)?

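17. Show that $\mathbb{Q}(i) = \{r + si \mid r, s \in \mathbb{Q}\}$ is a subfield of $\mathbb{C}$.

18. (a) Show that $\mathbb{Q}(\sqrt{5}i) = \{r + s\sqrt{5}i \mid r, s \in \mathbb{Q}\}$ is a subfield of $\mathbb{C}$. [Hint: Example 4.]
(b) Show that $\mathbb{Z}(\sqrt{5}i) = \{n + m\sqrt{5}i \mid n, m \in \mathbb{Z}\}$ is a subring of $\mathbb{C}$ and find the units.
[Hint: Example 14 §3.1.]

19. Show that $\mathbb{Q}(\sqrt{2})$ is the smallest subfield of $\mathbb{R}$ that contains $\sqrt{2}$.

20. Show that $\mathbb{Z}(\sqrt{2}) = \{n + m\sqrt{2} \mid n, m \in \mathbb{Z}\}$ is a subring of $\mathbb{C}$ and find 10 units (in
fact there are infinitely many). [Hint: Example 4.]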

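21. Let $w \in \mathbb{C}$ satisfy $w^2 \in \mathbb{Z}$, but $w \notin \mathbb{Q}$, and define $\mathbb{Z}(w) = \{n + mw \mid n, m \in \mathbb{Z}\}$. If
$r = n + mw$ is in $\mathbb{Z}(w)$ write $r^* = n - mw$ and $N(r) = n^2 - w^2m^2$. Show that:

(a) $\mathbb{Z}(w)$ is an integral domain.

(b) $n + mw = n' + m'w$ in $\mathbb{Z}(w)$ if and only if $n = n'$ and $m = m'$.

(c) $r^{**} = r$, $(rs)^* = r^*s^*$ and $(pr + qs)^* = pr^* + qs^*$ for all $p, q \in \mathbb{Z}$ and $r, s \in \mathbb{Z}(w)$.

(d) $N(r) = rr^*$ and $N(rs) = N(r)N(s)$ for all $r, s \in \mathbb{Z}(w)$.

(e) $r \in \mathbb{Z}(w)$ is a unit if and only if $N(r) = \pm 1$.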

22. If $R$ is a ring, show that $R$ is an integral domain if and only if it satisfies the
condition: $ab = ca$, $a \neq 0$, implies that $b = c$.

23. Show that a finite domain is a division ring (a field by Wedderburn’s theorem).

24. Recall that the binomial coefficient is defined by $\binom{n}{r} = \frac{n!}{r!(n - r)!}$ for $0 \leq r \leq n$.

(a) If $p$ is a prime, show that $p \mid \binom{p}{r}$ for $1 \leq r \leq p - 1$. [Hint: For $1 \leq r \leq n - 1$, show
that (7) = #05)! 

(b) If $ab = ba$ in a ring of characteristic $p$, show that $(a + b)^p = a^p + b^p$.

(c) Let $F$ be a finite field of characteristic $p$ ($p$ a prime). If $\sigma : F \to F$ is defined by
$\sigma(a) = a^p$, show that $\sigma$ is an automorphism of $F$ (the Frobenius automorphism).

25. Let $R$ be an integral domain and let $Q \supseteq R$ be the field of quotients. If $\sigma : R \to R$
is an automorphism, show that there is a unique automorphism $\bar{\sigma} : Q \to Q$ that
satisfies $\bar{\sigma}(r) = \sigma(r)$ for all $r \in R$.

26. Show that the multiplication in (the construction of) the field of quotients of an
integral domain:

(a) is well defined; (b) is associative; (c) satisfies the distributive laws.


27. If $R$ is an integral domain, show that the field of quotients $Q$ in Theorem 5 is the
smallest field containing $R$ in the following sense: If $R \subseteq F$, where $F$ is a field, show
that $F$ has a subfield $K$ such that $R \subseteq K$ and $K \cong Q$.

28. Let $R$ be a commutative ring and call $u \in R$ a nonzero-divisor if $ur = 0$, $r \in R$
implies $r = 0$. Let $U \subseteq R$ be a set of nonzero-divisors in $R$ such that $1 \in U$,
and $ab \in U$ whenever $a, b \in U$. Generalize Theorem 5 by showing that a ring of
quotients $Q = \{\frac{r}{u} \mid r \in R,\ u \in U\}$ exists. Show further that $R$ can be regarded
as a subring of $Q$ and, in this case, that each element of $U$ is a unit in $Q$ and
$Q = \{ru^{-1} \mid r \in R,\ u \in U\}$.

29. If $R$ is a ring, recall the definition of the ring $R(w)$ preceding Example 5 where
$w^2 = -1$ and $rw = wr$ for all $r \in R$.

(a) Is $\mathbb{C}(w)$ a field? What about $\mathbb{Z}_5(w)$? $\mathbb{Z}_7(w)$?

(b) If $R$ is commutative, show that $R(w)^* = \{r + sw \mid r^2 + s^2 \in R^*\}$.

(c) If $p$ is a prime and $p \equiv 3 \pmod 4$, show that $\mathbb{Z}_p(w)$ is a field of order $p^2$. [Hint:
Corollary to Theorem 8 §1.3.]

(d) If $R$ is an integral domain in which $2 \neq 0$, show that $R(w)$ has no nonzero
nilpotents.

(e) If $R$ is an integral domain in which $2 \in R^*$, show that the only idempotents in $R(w)$
are $0$, $1$, and $\frac{1}{2} + sw$, where $(2s)^2 = -1$.

(f) Show that $R(w) \cong \left\{ \begin{bmatrix} r & s \\ -s & r \end{bmatrix} \;\middle|\; r, s \in R \right\}$, a subring of $M_2(R)$.

30. Let $p$ and $q$ denote quaternions and let $a, b \in \mathbb{R}$. Show that

(a) $(q^*)^* = q$

(b) $(ap + bq)^* = ap^* + bq^*$

(c) $N(q) = qq^* = q^*q$

(d) $(pq)^* = q^*p^*$ [Hint: First show that $(iq)^* = -q^*i$, $(jq)^* = -q^*j$, and $(kq)^* = -q^*k$,
and then use (b).]

(e) $N(pq) = N(p)N(q)$ [Hint: (c) and (d).]

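31. Write $\mathbf{1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $\mathbf{i} = \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix}$, $\mathbf{j} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$, and $\mathbf{k} = \begin{bmatrix} 0 & i \\ i & 0 \end{bmatrix}$ in $M_2(\mathbb{C})$. Show that

(a) $\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{i}\mathbf{j}\mathbf{k} = -\mathbf{1}$.

(b) $a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} = \begin{bmatrix} a + bi & c + di \\ -c + di & a - bi \end{bmatrix}$ for all $a$, $b$, $c$, and $d$ in $\mathbb{R}$.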


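(c) $a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} = a'\mathbf{1} + b'\mathbf{i} + c'\mathbf{j} + d'\mathbf{k}$ if and only if $a = a'$, $b = b'$, $c = c'$, and
$d = d'$.

(d) $a\mathbf{i} = \mathbf{i}a$, $a\mathbf{j} = \mathbf{j}a$, and $a\mathbf{k} = \mathbf{k}a$ for all $a \in \mathbb{R}$.

(e) $\mathbb{H}$ is isomorphic to $\{a\mathbf{1} + b\mathbf{i} + c\mathbf{j} + d\mathbf{k} \mid a, b, c, d \in \mathbb{R}\}$.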

32. If $R$ is commutative and $\mathbb{H}(R) = \{a + bi + cj + dk \mid a, b, c, d \in R\}$, we declare that
$a + bi + cj + dk = a' + b'i + c'j + d'k$
if and only if $a = a'$, $b = b'$, $c = c'$, and $d = d'$. As for the quaternions, the addition
and multiplication in $\mathbb{H}(R)$ are determined by the ring axioms, the conditions
that $i^2 = j^2 = k^2 = ijk = -1$, and the conditions $ai = ia$, $aj = ja$, and $ak = ka$
for all $a \in R$. If $q = a + bi + cj + dk$ in $\mathbb{H}(R)$, define $q^* = a - bi - cj - dk$, and
$N(q) = a^2 + b^2 + c^2 + d^2$.

(a) Show that $q$ is a unit in $\mathbb{H}(R)$ if and only if $N(q)$ is a unit in $R$.

(b) Show that $\mathbb{H}(R)$ is a division ring if and only if $R$ is a field and $a^2 + b^2 + c^2 + d^2 = 0$
in $R$ implies that $a = b = c = d = 0$. Is $\mathbb{H}(R)$ a division ring if $R = \mathbb{C}$, $\mathbb{Z}_2$, $\mathbb{Z}_3$, $\mathbb{Z}_5$,
$\mathbb{Z}_7$, or $\mathbb{Z}_{11}$?

(c) Let $A_0(R) = \{r \in R \mid 2r = 0\}$. Show that $Z[\mathbb{H}(R)] = \{a + si + tj + uk \mid a \in R,\ s, t, u \in A_0(R)\}$.
Describe $Z[\mathbb{H}(\mathbb{Z}_6)]$. Show that $\mathbb{H}(R)$ is commutative if and only if
$R$ has characteristic 2.

(d) Show that $q^2 - 2aq + N(q) = 0$ for all $q = a + bi + cj + dk$ in $\mathbb{H}(R)$.


3.3 IDEALS AND FACTOR RINGS 


Let $R$ be a ring and let $A$ be an additive subgroup of $R$. Then $A$ is normal in the
(abelian) additive group $(R, +)$, so we obtain the additive factor group

\[ R/A = \{r + A \mid r \in R\} \]

where the (additive) cosets are defined by $r + A = \{r + a \mid a \in A\}$. The essential
features of the arithmetic in $R/A$ are collected in Lemma 1 for reference; of course,
they are just translations of the same properties for multiplicative groups.


Lemma 1. Let $A$ be an additive subgroup of a ring $R$ and let $r, s \in R$. The
following assertions are valid in the factor group $R/A$.

(1) $r + A = s + A$ if and only if $r - s \in A$.
(2) $(r + A) + (s + A) = (r + s) + A$.
(3) $0 + A = A$ is the (additive) unity of $R/A$.
(4) $-(r + A) = -r + A$ is the (additive) inverse of $r + A$.
(5) $k(r + A) = kr + A$ for all $k \in \mathbb{Z}$.

In our construction of $\mathbb{Z}_n = \mathbb{Z}/n\mathbb{Z}$ in Section 1.3, the cosets were written as
$\bar{k} = k + n\mathbb{Z}$, $k \in \mathbb{Z}$, and we turned $\mathbb{Z}_n$ into a ring via the multiplication $\bar{k}\,\bar{m} = \overline{km}$.
If $A$ is any additive subgroup of a ring $R$, this suggests defining multiplication in
$R/A$ by $(r + A)(s + A) = rs + A$. However, this multiplication is well defined only
for rather special subgroups $A$. To describe them we adopt the following notation:
For any element $a$ in $R$, write

\[ Ra = \{ra \mid r \in R\} \quad\text{and}\quad aR = \{ar \mid r \in R\}. \]


Lemma 2. Let $A$ be an additive subgroup of a ring $R$. The following are equivalent:
(1) The multiplication $(r + A)(s + A) = rs + A$ is well defined on $R/A$.
(2) $Ra \subseteq A$ and $aR \subseteq A$ for every $a$ in $A$.


Proof. (1) $\Rightarrow$ (2). Note first that (1) turns $R/A$ into a ring. Hence if $r \in R$ and
$a \in A$ then, using (1) and Theorem 1 §3.1, we obtain

\[ ra + A = (r + A)(a + A) = (r + A)(0 + A) = r0 + A = 0 + A = A. \]

This implies that $ra \in A$, so $Ra \subseteq A$. Similarly, $aR \subseteq A$.
(2) $\Rightarrow$ (1). If $r + A = r' + A$ and $s + A = s' + A$, we must show that $rs + A = r's' + A$.
We have $r - r' \in A$ and $s - s' \in A$, so

\[ rs - r's' = r(s - s') + (r - r')s' \in R(s - s') + (r - r')R \subseteq A \]

by (2). Hence $rs + A = r's' + A$, as required. ■
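
Lemma 2 can be confirmed by exhaustive search in a small ring. The Python sketch below is an illustration under an assumption not made in the text: it takes $R = \mathbb{Z}_2 \times \mathbb{Z}_2$ with componentwise operations, a ring with exactly five additive subgroups. The function names are hypothetical. For each subgroup it tests conditions (1) and (2) independently.

```python
from itertools import product

R = list(product(range(2), repeat=2))   # the ring Z_2 x Z_2, componentwise operations


def add(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)


def mul(x, y):
    return ((x[0] * y[0]) % 2, (x[1] * y[1]) % 2)


def coset(r, A):
    """The coset r + A as a frozenset."""
    return frozenset(add(r, a) for a in A)


def is_ideal(A):
    """Condition (2): Ra and aR lie in A for every a in A."""
    return all(mul(r, a) in A and mul(a, r) in A for r in R for a in A)


def mult_well_defined(A):
    """Condition (1): rs + A depends only on the cosets r + A and s + A."""
    return all(coset(mul(r, s), A) == coset(mul(r2, s2), A)
               for r in R for s in R for r2 in R for s2 in R
               if coset(r, A) == coset(r2, A) and coset(s, A) == coset(s2, A))


zero = (0, 0)
subgroups = [{zero}, {zero, (1, 0)}, {zero, (0, 1)}, {zero, (1, 1)}, set(R)]
results = {frozenset(A): (is_ideal(A), mult_well_defined(A)) for A in subgroups}
```

The two tests agree on every subgroup. The only failure is the diagonal subgroup $\{(0,0), (1,1)\}$: there $(1,0)$ and $(0,1)$ lie in the same coset, yet $(1,0)(1,0) = (1,0)$ and $(0,1)(1,0) = (0,0)$ lie in different cosets.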




An additive subgroup $A$ of a ring $R$ is called an ideal⁵⁴ of $R$ if

\[ Ra \subseteq A \text{ and } aR \subseteq A \text{ for every } a \in A; \]

that is, if every multiple of an element of $A$ is again in $A$.


Theorem 1. Let $A$ be an ideal of the ring $R$. Then the additive factor group $R/A$
becomes a ring with the multiplication $(r + A)(s + A) = rs + A$. The unity of $R/A$
is $1 + A$, and $R/A$ is commutative if $R$ is commutative.


Proof. Because $A$ is an additive subgroup, $R/A$ is an additive abelian group. The
multiplication is well defined by Lemma 2. Verification that it is associative, that
$1 + A$ is the unity, and that the distributive laws hold is left to the reader, along
with the proof that $R/A$ is commutative if $R$ is (Exercise 3). ■


If A is an ideal of a ring R, the ring R/A in Theorem 1 is called the factor 
ring of R by A. This definition should be compared to the definition of factor 
groups in Section 2.9. Clearly, ideals play a role in ring theory analogous to normal 
subgroups in group theory, each yielding the construction of a factor structure using 
cosets. Note, however, that although normal subgroups of a group are certainly 
subgroups, most ideals are not subrings. It is true that ideals of R are closed under 
multiplication (and so are general subrings) but, as the next theorem shows, the 
only ideal that contains the unity of R is R itself. 


Theorem 2. The following are equivalent for an ideal $A$ of a ring $R$.

(1) $1 \in A$.
(2) $A$ contains a unit.
(3) $A = R$.

Proof. (1) $\Rightarrow$ (2) and (3) $\Rightarrow$ (1) are obvious. If $u \in A$ is a unit then $1 = u^{-1}u \in A$
because $A$ is an ideal. Hence, $r = r \cdot 1 \in A$ for all $r \in R$, proving (2) $\Rightarrow$ (3). ■


Example 1. If $R$ is any ring, $\{0\}$ and $R$ are ideals of $R$, and the factor rings are
$R/\{0\} \cong R$ and $R/R = \{R\}$, the zero ring with one element. The ideal $0 = \{0\}$ is
called the zero ideal of $R$, and any ideal $A \neq R$ is called a proper ideal of $R$.


Example 2. If $n \geq 0$, then $n\mathbb{Z}$ is an ideal of $\mathbb{Z}$ and $\mathbb{Z}/n\mathbb{Z} = \mathbb{Z}_n$ if $n \geq 2$.


Note that every additive subgroup of $\mathbb{Z}$ has the form $n\mathbb{Z}$ for some $n \geq 0$, so every
additive subgroup of $\mathbb{Z}$ is an ideal. In fact, $\mathbb{Z}$ and $\mathbb{Z}_n$, $n \geq 2$, are the only nonzero
rings having this property.


Example 3. If $a \in Z(R)$, show that $Ra = aR$, and that this is an ideal of $R$
called the principal ideal generated by $a$.⁵⁵


Solution. First, $Ra$ is an additive subgroup of $R$ because $ra + sa = (r + s)a$, $0 = 0a$,
and $-(ra) = (-r)a$. If $x = sa \in Ra$ and $r \in R$, then $rx = r(sa) = (rs)a \in Ra$. Note
that we have not yet used the fact that $a \in Z(R)$. But this is needed to show that
$xr = (sa)r = sra \in Ra$ for all $r \in R$. Hence, $Ra$ is an ideal. Clearly $Ra = aR$. □


⁵⁴ In the nineteenth century it was observed that the prime factorization theorem for the ring $\mathbb{Z}$
of rational integers did not extend to certain subrings of $\mathbb{C}$. Ernst Eduard Kummer showed that
unique factorization was achieved for what he called ideal numbers. The term ideal was first used
by Richard Dedekind who realized that the ideal numbers could best be described as ideals in the
modern sense.

⁵⁵ In a commutative ring $R$, the ideal $Ra$ of $R$ is often denoted $(a)$.


Note that, if $a \in Z(R)$, the ideal $(a) = Ra = aR$ in Example 3 contains $a$ and is
contained in every ideal of $R$ that contains $a$. Hence, it is the smallest ideal of $R$
containing $a$. If $a \notin Z(R)$, the description of this smallest ideal containing $a$ is more
complex (see Exercise 27).


Example 4. If $a \in Z(R)$, show that $\operatorname{ann}(a) = \{r \in R \mid ra = 0\}$ is an ideal of $R$,
called the annihilator of $a$.


Solution. The set $\operatorname{ann}(a)$ is an additive subgroup because $0a = 0$, and $ra = sa = 0$
implies that $(r + s)a = 0 = (-r)a$. If $ra = 0$ and $t \in R$, then $(tr)a = t(ra) = 0$, and
$(rt)a = rat = 0$ because $a \in Z(R)$. Hence, $tr \in \operatorname{ann}(a)$ and $rt \in \operatorname{ann}(a)$. □
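
Principal ideals and annihilators are easy to list in the rings $\mathbb{Z}_n$, where every element is central because the ring is commutative. The following Python sketch is an illustration only; the function names are hypothetical and elements of $\mathbb{Z}_n$ are represented by the integers $0, 1, \ldots, n - 1$.

```python
def principal_ideal(a, n):
    """Ra = {ra : r in Z_n}, returned as a sorted list."""
    return sorted({(r * a) % n for r in range(n)})


def annihilator(a, n):
    """ann(a) = {r in Z_n : ra = 0}."""
    return [r for r in range(n) if (r * a) % n == 0]


def is_ideal(A, n):
    """Brute-force test: additive subgroup that absorbs all multiples."""
    S = set(A)
    closed_add = all((x - y) % n in S for x in S for y in S)
    absorbs = all((r * x) % n in S for r in range(n) for x in S)
    return 0 in S and closed_add and absorbs
```

In $\mathbb{Z}_{12}$ with $a = 8$, the principal ideal is `[0, 4, 8]` and the annihilator is `[0, 3, 6, 9]`, and `is_ideal` accepts both.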


An ideal $A$ of a ring $R$ is a general ring but it contains the unity of $R$ only
if $A = R$ by Theorem 2. However, if $e^2 = e$ is a central idempotent in $R$ then
$A = Re$ is an ideal by Example 3, and $e$ is the unity of $A$ by Theorem 6 §3.1 (in
fact $Re = eRe$). This observation has a converse that we need later.


Example 5. Let $A$ be an ideal of a ring $R$, and assume that $A$ is a ring with unity
$e$. Show that $e$ is a central idempotent of $R$, and that $A = eRe$.


Solution. Clearly $e^2 = e$ because $e$ is the unity of $A$. To show that $e$ is central,
let $r \in R$ and write $a = er - ere$. Then $a \in A$ so, since $e$ is the unity of $A$,
$a = ae = ere - ere^2 = 0$. Hence $er = ere$, and a similar argument shows that
$re = ere$. Thus $er = ere = re$ for all $r \in R$; that is, $e$ is central. Next, if $a \in A$,
then $a = ae \in Re$, so $A \subseteq Re$. Since $Re \subseteq A$ because $e \in A$, we have $A = Re$.
Finally, $Re = eRe$ because $e$ is a central idempotent. □


Example 6 illustrates how to carry out computations in a factor ring.


Example 6. Let $R = \mathbb{Z}(i)$ be the ring of gaussian integers and let $A = (2 + i)R$
denote the ideal of all multiples of $2 + i$. Describe the cosets in $R/A$.


Solution. A typical coset $x$ in $R/A$ has the form $x = (m + ni) + A$, where $m, n \in \mathbb{Z}$.
Since $2 + i \in A$, we have $i + A = -2 + A$. Hence, $x = (m - 2n) + A$ in $R/A$; that
is,

\[ x = k + A \quad\text{for some } k \in \mathbb{Z}. \]

This simplifies even further: Note that $5 = (2 + i)(2 - i) \in A$, so $5 + A = 0 + A$.
Thus, if $k = 5q + r$, $0 \leq r \leq 4$, we get $x = k + A = (5 + A)(q + A) + (r + A) = r + A$
in the ring $R/A$. Hence,

\[ R/A = \{0 + A,\ 1 + A,\ 2 + A,\ 3 + A,\ 4 + A\}. \]

We claim that these five cosets are distinct. Suppose that $r + A = s + A$, where
$0 \leq s \leq r \leq 4$. Then $r - s \in A$, say $r - s = (2 + i)(a + bi)$ for some $a, b \in \mathbb{Z}$.
Taking the square of the absolute value of each side gives $(r - s)^2 = 5(a^2 + b^2)$. As $(r - s)^2$ is $0, 1, 4, 9,$ or $16$, the
only possibility is $r = s$. Thus $|R/A| = 5$. Note, finally, that $R/A \cong \mathbb{Z}_5$ is a field by
Theorem 7 §3.1.
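
The reduction carried out in Example 6 amounts to a simple rule: replace $i$ by $-2$ and reduce modulo $5$. The Python sketch below is an illustration of that rule, not a proof. Gaussian integers $m + ni$ are stored as pairs `(m, n)`, and the names `gmul` and `residue` are hypothetical.

```python
def gmul(x, y):
    """Product of gaussian integers written as pairs (m, n) = m + ni."""
    (m, n), (p, q) = x, y
    return (m * p - n * q, m * q + n * p)


def residue(x):
    """Representative in {0,...,4} of the coset (m + ni) + A,
    using i + A = -2 + A and 5 + A = 0 + A."""
    m, n = x
    return (m - 2 * n) % 5


GEN = (2, 1)   # the generator 2 + i of the ideal A
```

Every multiple of $2 + i$ has residue $0$, the integers $0, \ldots, 4$ have distinct residues, and `residue` is compatible with multiplication, which is consistent with $R/A \cong \mathbb{Z}_5$.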


An ideal $P$ of a commutative ring $R$ is called a prime ideal if $P \neq R$ and $P$
has the following property:

\[ \text{If } rs \in P, \text{ then } r \in P \text{ or } s \in P. \]




Recall that a commutative ring $R$ is an integral domain if and only if $rs = 0$
implies $r = 0$ or $s = 0$, that is, if and only if $0$ is a prime ideal in $R$. The following
characterization of prime ideals is a basic fact in the theory of commutative rings.


Theorem 3. If $R$ is a commutative ring, an ideal $P \neq R$ of $R$ is a prime ideal if
and only if $R/P$ is an integral domain.


Proof. If $R/P$ is an integral domain and $rs \in P$, then $(r + P)(s + P) = rs + P = P$
is the zero of $R/P$, so either $r + P = P$ or $s + P = P$. Hence $r \in P$ or $s \in P$, so
$P$ is a prime ideal. Conversely, if $P$ is a prime ideal, let $(r + P)(s + P) = P$, the
zero of $R/P$; that is, $rs + P = P$. Hence, $rs \in P$, so $r \in P$ or $s \in P$ because $P$ is
a prime ideal. Thus $r + P = P$ or $s + P = P$, proving that $R/P$ is a domain. It is
commutative because $R$ is commutative. ■


Example 7. If $n \geq 2$ in $\mathbb{Z}$, show that $n\mathbb{Z}$ is a prime ideal if and only if $n$ is a prime.


Solution. Here, $\mathbb{Z}/n\mathbb{Z} = \mathbb{Z}_n$, which is an integral domain if and only if $n$ is a prime
(Theorem 7 §1.3). Hence, Theorem 3 applies. □


We now describe all the ideals of a factor ring R/A in terms of the ideals of R 
which contain A. 


Theorem 4. Let $A$ be an ideal of a ring $R$.
(1) If $B$ is an ideal of $R$ with $A \subseteq B$ then $B/A = \{b + A \mid b \in B\}$ is an
ideal of $R/A$.
(2) If $\bar{B}$ is any ideal of $R/A$ then $\bar{B} = B/A$ for some (unique) ideal $B$ of
$R$ with $A \subseteq B$. In fact, $B = \{b \in R \mid b + A \in \bar{B}\}$.
(3) If $B$ and $B_1$ are ideals of $R$ that contain $A$, then

\[ B \subseteq B_1 \quad\text{if and only if}\quad B/A \subseteq B_1/A. \]


Proof. (1) This is a routine verification that we leave to the reader.

(2) Given an ideal $\bar{B} \subseteq R/A$, let $B = \{b \in R \mid b + A \in \bar{B}\}$. Then $B$ is an ideal of
$R$ (verify), and we have $A \subseteq B$ because $a + A = 0 + A \in \bar{B}$ for all $a \in A$. Hence it
remains to show that $\bar{B} = B/A$. We have $\bar{B} \subseteq B/A$ because $r + A \in \bar{B}$ implies that
$r \in B$, hence $r + A \in B/A$. Conversely, if $r + A \in B/A$ then $r + A = b + A$ for some
$b \in B$. But $b + A \in \bar{B}$ because $b \in B$, that is, $r + A \in \bar{B}$, and we have shown that
$B/A \subseteq \bar{B}$. Hence $B/A = \bar{B}$, as required.

(3) If $B \subseteq B_1$ it is clear that $B/A \subseteq B_1/A$. For the converse, assume
that $B/A \subseteq B_1/A$, and let $b \in B$. Then $b + A \in B_1/A$, say $b + A = b_1 + A$
for some $b_1 \in B_1$. Hence, $b \in b_1 + A \subseteq B_1$ because $A \subseteq B_1$, so $B \subseteq B_1$ as
required. ■


Simple Rings 


By analogy with groups, a ring $R$ is called a simple ring if $R \neq 0$ and the only
ideals of $R$ are $0$ and $R$.


Example 8. Show that every division ring is simple.


Solution. Let $A \neq 0$ be an ideal in a division ring $R$. If $0 \neq r \in A$, then $r$ is a unit
(because $R$ is a division ring), so $A = R$ by Theorem 2. □




There are simple rings that are not division rings (Theorem 7 below), but such 
rings must be noncommutative by the next result. 


Theorem 5. If $R$ is commutative, then $R$ is simple if and only if it is a field.


Proof. Every field is simple by Example 8. Conversely, if $R$ is simple and commutative,
let $0 \neq a \in R$. Then $Ra = \{ra \mid r \in R\}$ is an ideal of $R$ (by Example 3).
Because $Ra \neq 0$ (as $a \in Ra$), the simplicity of $R$ shows that $Ra = R$. Thus $1 \in Ra$,
so $1 = ba$ for some $b \in R$. Hence, $a$ is a unit in $R$, so $R$ is a field, as required. ■


The simple rings are closely related to the following class of ideals. An ideal $M$
in a ring $R$ is called a maximal ideal of $R$ if $M \neq R$ and the only ideals $A$ of $R$
such that $M \subseteq A \subseteq R$ are $A = M$ and $A = R$.


Theorem 6. Let $A$ be an ideal of a ring $R$. Then $A$ is maximal in $R$ if and only if
$R/A$ is a simple ring.


Proof. Assume that $A$ is maximal and (using Theorem 4) let $B/A$ be a nonzero ideal
of $R/A$, where $B$ is an ideal of $R$ with $A \subseteq B$. Since $B/A \neq 0$, let $0 \neq b + A \in B/A$
where $b \in B$. Then $b \in B$ but $b \notin A$, so $A \neq B$. Thus, $B = R$ by the maximality of
$A$, hence $B/A = R/A$. This shows that $R/A$ is simple.

Conversely, if $R/A$ is simple, let $A \subseteq B \subseteq R$ where $B \neq A$ is an ideal of $R$.
Then $B/A$ is an ideal of $R/A$ by Theorem 4, and $B/A \neq 0$ because $B \neq A$. Hence,
$B/A = R/A$ by the simplicity of $R/A$, and so $B = R$ by (3) of Theorem 4. This
shows that $A$ is a maximal ideal of $R$. ■


Combining Theorems 5 and 6 gives 


Corollary 1. If R is a commutative ring, an ideal A of R is maximal if and only 
if R/A is a field. 


The fact that every field is an integral domain, together with Theorem 3, gives

Corollary 2. Every maximal ideal of a commutative ring is a prime ideal.


Note that the converse of Corollary 2 is false: In the ring $\mathbb{Z}$ of integers, the zero
ideal is prime by Theorem 3, but it is not maximal by Corollary 1 because $\mathbb{Z}$ is an
integral domain that is not a field.


We conclude this section by constructing some simple rings other than division
rings. In fact, we verify that $M_n(R)$ is a simple ring if $R$ is a division ring. The
proof requires some preliminary remarks about certain special matrices.

Let $R$ be a ring and let $n \geq 1$ be a fixed integer. If $1 \leq i, j \leq n$, let $E_{ij}$ denote
the $n \times n$ matrix with $(i, j)$-entry $1$ and all other entries $0$. The matrices $E_{ij}$, where
$1 \leq i, j \leq n$, are called matrix units in $M_n(R)$. Thus, the matrix units in $M_2(R)$
are

\[ E_{11} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad E_{12} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad E_{21} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad\text{and}\quad E_{22} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. \]

In general there are $n^2$ matrix units in $M_n(R)$.
If $A$ is an ideal of a ring $R$, it is easy to show that $M_n(A)$ is an ideal of $M_n(R)$.
The converse holds too. To prove it, define the scalar product $rA$, $r \in R$, by:

If $A = [a_{ij}]$ is any matrix and $r \in R$, define $rA = [ra_{ij}]$.




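Thus, $rE_{ij}$ is the matrix with $(i, j)$-entry $r$ and all other entries $0$. Hence, every
matrix is a “linear combination” of the matrix units $E_{ij}$, where $1 \leq i, j \leq n$:

\[ [a_{ij}] = \sum_{i,j} a_{ij}E_{ij}. \tag{$*$} \]

Moreover, matrix multiplication gives the following useful formula for $r, s \in R$:

\[ (rE_{ij})(sE_{kl}) = \begin{cases} rsE_{il}, & \text{if } j = k, \\ 0, & \text{if } j \neq k. \end{cases} \tag{$**$} \]

Lemma 3. Every ideal of $M_n(R)$ has the form $M_n(A)$ for some ideal $A$ of $R$.


Proof. If $\mathcal{A}$ is an ideal of $M_n(R)$, let $A = \{a \in R \mid aE_{11} \in \mathcal{A}\}$. Then $A$ is an
ideal of $R$ (verify), and we show that $\mathcal{A} = M_n(A)$. To see that $M_n(A) \subseteq \mathcal{A}$, let
$X \in M_n(A)$, say $X = [a_{pq}]$ where $a_{pq} \in A$ for $1 \leq p, q \leq n$. Then ($**$) gives
$a_{pq}E_{pq} = E_{p1}(a_{pq}E_{11})E_{1q} \in \mathcal{A}$ because $a_{pq}E_{11} \in \mathcal{A}$. Then, using condition ($*$), we
have $X = [a_{pq}] = \sum_{p,q} a_{pq}E_{pq} \in \mathcal{A}$, proving that $M_n(A) \subseteq \mathcal{A}$.

Conversely, let $B = [b_{ij}] \in \mathcal{A}$. We must show that $b_{pq} \in A$ for all $p$ and $q$, that
is, $b_{pq}E_{11} \in \mathcal{A}$. Since $B = \sum_{i,j} b_{ij}E_{ij}$ by ($*$), we compute

\[ E_{1p}BE_{q1} = E_{1p}\Big(\sum_{i,j} b_{ij}E_{ij}\Big)E_{q1} = \sum_{i,j} E_{1p}(b_{ij}E_{ij})E_{q1} = b_{pq}E_{11} \]

by ($**$). Hence, $b_{pq}E_{11} = E_{1p}BE_{q1} \in \mathcal{A}$ because $B \in \mathcal{A}$ and $\mathcal{A}$ is an ideal. ■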
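
The multiplication rule for matrix units, and the identity $E_{1p}BE_{q1} = b_{pq}E_{11}$ used in the proof, can be checked directly. The Python sketch below is an illustration over the integers (an assumption, since the text allows any ring $R$). Matrices are plain nested lists and the helper names `E` and `matmul` are hypothetical.

```python
def E(i, j, n, r=1):
    """The n x n matrix r * E_ij, with 1-based indices i and j."""
    return [[r if (p, q) == (i, j) else 0 for q in range(1, n + 1)]
            for p in range(1, n + 1)]


def matmul(X, Y):
    """Ordinary product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[p][t] * Y[t][q] for t in range(n)) for q in range(n)]
            for p in range(n)]


n = 3
zero = [[0] * n for _ in range(n)]
B = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

With these definitions `matmul(matmul(E(1, 2, n), B), E(3, 1, n))` equals `E(1, 1, n, 6)`, because the $(2, 3)$-entry of `B` is $6$.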


Theorem 7. If $R$ is a ring then $M_n(R)$ is simple if and only if $R$ is simple.


Proof. Assume that $R$ is simple. If $\mathcal{A}$ is an ideal of $M_n(R)$ then Lemma 3 shows that
$\mathcal{A} = M_n(A)$ for some ideal $A$ of $R$. Hence, $A$ is 0 or $R$ because $R$ is simple,
whence $\mathcal{A} = M_n(0) = 0$ or $\mathcal{A} = M_n(R)$. This shows that $M_n(R)$ is simple. The
converse is proved similarly, and we leave it to the reader. ∎
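
The two matrix-unit identities that drive Lemma 3 can be checked by machine. The sketch below is only an illustration over the integers with $n = 3$; the helper names `unit` and `matmul`, the scalars 2 and 5, and the sample matrix `B` are all assumptions made for the example.

```python
def unit(i, j, n, r=1):
    """Return r*E_ij as an n x n list of rows (indices i, j run from 1 to n)."""
    return [[r if (a, b) == (i, j) else 0 for b in range(1, n + 1)]
            for a in range(1, n + 1)]

def matmul(x, y):
    """Ordinary matrix product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[a][k] * y[k][b] for k in range(n)) for b in range(n)]
            for a in range(n)]

n = 3
zero = [[0] * n for _ in range(n)]
idx = range(1, n + 1)

# Formula (**): (rE_ij)(sE_km) is rs*E_im when j == k and 0 otherwise.
formula_holds = all(
    matmul(unit(i, j, n, 2), unit(k, m, n, 5))
    == (unit(i, m, n, 10) if j == k else zero)
    for i in idx for j in idx for k in idx for m in idx
)

# The step used in the proof: E_1p * B * E_q1 picks out b_pq * E_11.
B = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sandwich_holds = all(
    matmul(matmul(unit(1, p, n), B), unit(q, 1, n))
    == unit(1, 1, n, B[p - 1][q - 1])
    for p in idx for q in idx
)
print(formula_holds, sandwich_holds)
```

A finite check like this proves nothing in general, but it is a quick way to catch an index slip when applying $(**)$ by hand.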


Corollary. If $R$ is a division ring then $M_n(R)$ is simple.


Thus, for example, $M_2(\mathbb{Z}_2)$ is a simple noncommutative ring that is not a division
ring. In fact it can be shown to be the smallest such ring.
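
Because $M_2(\mathbb{Z}_2)$ has only 16 elements, its simplicity can be confirmed by brute force. The following sketch (the tuple encoding and all function names are assumptions chosen for illustration) computes the two-sided ideal generated by each nonzero matrix and checks that it is the whole ring; it also checks that the ring is noncommutative and counts its units.

```python
from itertools import product

# All 2x2 matrices over Z_2, stored as (a, b, c, d) for [[a, b], [c, d]].
MATS = list(product(range(2), repeat=4))
ZERO = (0, 0, 0, 0)
ONE = (1, 0, 0, 1)

def add(x, y):
    return tuple((p + q) % 2 for p, q in zip(x, y))

def mul(x, y):
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % 2, (a*f + b*h) % 2, (c*e + d*g) % 2, (c*f + d*h) % 2)

def ideal_generated_by(x):
    """Two-sided ideal generated by x: all finite sums of products p*x*q."""
    generators = {mul(mul(p, x), q) for p in MATS for q in MATS}
    ideal = {ZERO}
    changed = True
    while changed:
        changed = False
        for s in list(ideal):
            for g in generators:
                t = add(s, g)
                if t not in ideal:
                    ideal.add(t)
                    changed = True
    return ideal

# Every nonzero ideal contains a nonzero principal ideal, so this suffices.
is_simple = all(len(ideal_generated_by(x)) == 16 for x in MATS if x != ZERO)
is_commutative = all(mul(x, y) == mul(y, x) for x in MATS for y in MATS)
units = [x for x in MATS if any(mul(x, y) == ONE for y in MATS)]
print(is_simple, is_commutative, len(units))
```

Since fewer than 15 of the nonzero elements are units, the ring is not a division ring, in line with the statement above.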

Part of the importance of Theorem 7 is that it gives half of another theorem of
Wedderburn: a “sufficiently small” ring $S$ is simple if and only if $S \cong M_n(D)$ for
some division ring $D$ and some $n \ge 1$. We explain what “sufficiently small” means
in Section 11.2 (being finite is enough), and give a proof of a more general version
of the theorem.


Richard Dedekind (1831-1916). Richard Dedekind, the son of a law professor, was 
born in Brunswick, Germany, the birthplace of Gauss. He obtained his Ph.D. at Göttingen
at the age of 21 and was Gauss’ last student. After a stay in Zurich, he returned to the
technical high school at Brunswick, where he remained for 50 years. He never married 
and lived with his sister until his death. 


Dedekind had wide mathematical interests. He became disturbed by the lack of a precise 
foundation for the set R of real numbers, and he filled this gap with his now-famous 
Dedekind cuts in a paper in 1872. His work in algebra also was of first importance. He 
lectured on group theory before Jordan, and stated the Peano axioms before Peano. 
Dedekind was one of the first to understand Galois theory and made fundamental
contributions to the theory of group characters. He also extended earlier work of Kummer.
The unique factorization of integers into primes is not true of elements in other integral
domains, and Kummer had shown that the uniqueness could be retrieved if certain ideal
numbers were used. Dedekind coined the term ideal and studied integral domains (now
called Dedekind domains) where all ideals factor uniquely as a product of prime ideals.
This work influenced Emmy Noether, thereby changing the course of modern algebra.
Dedekind also did pioneering work in the theory of rings, groups, and fields and has been 
called (by Morris Kline) “the founder of abstract algebra.” 


Exercises 3.3 


Throughout these exercises R denotes a ring unless otherwise specified.

1. In each case decide whether A is an ideal of the ring R. Support your answer.

(a) $R = \mathbb{C}$, $A = \mathbb{Z}$ (b) $R = \mathbb{Z} \times \mathbb{Z}$, $A = \{(k, k) \mid k \in \mathbb{Z}\}$
(yr=[§ R),4=[o 8] @R=(3 Z].4=[2 3] 
() R=[5 a =|4 2 | (f) R=Z(i), A={n+niln eZ} 
. if R=|5 s jand Ax|; ah S any ring, show that A is an ideal of R and 


describe the cosets in R/A. 

3. If $A$ is an ideal of $R$, complete the proof of Theorem 1 by verifying that
(a) $1 + A$ is the unity of $R/A$.
(b) The associative and distributive laws hold in $R/A$.
(c) If $R$ is commutative, so also is $R/A$.

4. (a) If $m$ is an integer, show that $mR = \{mr \mid r \in R\}$ and $A_m = \{r \in R \mid mr = 0\}$
are ideals of $R$.
(b) If $R = \mathbb{Z}_n$, show that every ideal of $R$ has the form $mR$ for some $m \in \mathbb{Z}$.

5. (a) If $A$ is an ideal of $R$ and $B$ is an ideal of $S$, show that $A \times B$ is an ideal of $R \times S$.
(b) Show that every ideal $\mathcal{A}$ of $R \times S$ has the form $\mathcal{A} = A \times B$ as in (a). [Hint:
$A = \{a \in R \mid (a, 0) \in \mathcal{A}\}$.]
(c) Show that the maximal ideals of $R \times S$ are either of the form $A \times S$ where $A$ is
maximal in $R$, or of the form $R \times B$, $B$ maximal in $S$.

6. If $A$ is an ideal of $R$, show that $M_2(A)$ is an ideal of $M_2(R)$.

7. Show that $\mathbb{Z} \times 0$ and $0 \times \mathbb{Z}$ are prime ideals of $\mathbb{Z} \times \mathbb{Z}$.

8. If $A$ and $B$ are ideals of $R$ such that $A \cap B = 0$, show that $ab = 0 = ba$ for all $a \in A$
and $b \in B$.

9. Let $R = \mathbb{Z}(i)$ be the ring of gaussian integers. In each case find the number of
elements in the factor ring $R/A$ and describe the cosets.
(a) $A = Ri$ (b) $A = R(1 - i)$
(c) $A = R(1 + 2i)$ (d) $A = R(1 + 3i)$
[Hint: $(1 + 2i)(1 - i) = 3 + i$ and $(1 + 3i)(1 - 3i) = 10$.]

10. If $R$ is a simple ring, show that $Z(R)$ is a field. Show that the converse is not true
by considering $R = \begin{bmatrix} F & F \\ 0 & F \end{bmatrix}$ where $F$ is a field.

11. (a) If $R$ is a simple ring and $n \in \mathbb{Z}$, show that either $nR = 0$, or $nr = 0$, $r \in R$,
implies that $r = 0$.
(b) Conclude that $R$ has characteristic 0 or a prime.


12. If $X \subseteq R$ is a nonempty subset of a commutative ring $R$, define the annihilator of
$X$ by $\operatorname{ann}(X) = \{a \in R \mid ax = 0 \text{ for all } x \in X\}$.
(a) Show that $\operatorname{ann}(X)$ is an ideal of $R$.
(b) If $X \subseteq Y$, show that $\operatorname{ann}(Y) \subseteq \operatorname{ann}(X)$.
(c) Show that $\operatorname{ann}(X \cup Y) = \operatorname{ann}(X) \cap \operatorname{ann}(Y)$.
(d) Show that $X \subseteq \operatorname{ann}[\operatorname{ann}(X)]$.
(e) Show that $\operatorname{ann}(X) = \operatorname{ann}\{\operatorname{ann}[\operatorname{ann}(X)]\}$.

13. Give an example where $R/A$ is commutative but $R$ is not.

14. If $X$ and $Y$ are additive subgroups of $R$, define $X + Y = \{x + y \mid x \in X, y \in Y\}$.
(a) Show that $X + Y$ is an additive subgroup that contains both $X$ and $Y$.
(b) If $A$ and $B$ are ideals of $R$, show that $A + B$ is an ideal of $R$.
(c) If $A$ is an ideal of $R$ and $S$ is a subring of $R$, show that $A + S$ is a subring of $R$.

15. If $A$ is an ideal of $R$, show that $A \cap S$ is an ideal of $S$ for all subrings $S$ of $R$.

16. If $A$ is an ideal of $R$, show that $R/A$ is commutative if and only if $rs - sr \in A$ for
all $r, s \in R$.

17. Let $Z = Z(R)$ denote the center of a ring $R$.
(a) When is $Z$ an ideal of $R$? Justify your answer.
(b) If $R$ is simple, show that $Z$ is a field.
(c) If $R/Z$ is cyclic as an additive group, show that $R$ is commutative. (This is the
analogue for rings of Theorem 2 §2.9.)

18. Let $A$, $B$ and $C$ be ideals of a ring $R$.
(a) Show that $A \cap B$ and $A + B = \{a + b \mid a \in A, b \in B\}$ are ideals of $R$.
(b) If $A \subseteq B$ and $A \subseteq C$, show that $\frac{B}{A} \cap \frac{C}{A} = \frac{B \cap C}{A}$ and $\frac{B}{A} + \frac{C}{A} = \frac{B + C}{A}$.

19. If $A$ is an ideal of $R$, show that $R/A$ has no nonzero nilpotents if and only if $r^2 \in A$
implies $r \in A$. [Hint: Exercise 11 §3.1.]

20. Let $R$ be a commutative ring.
(a) Show that every maximal ideal of $R$ is prime.
(b) If $R$ is finite, show that every prime ideal is maximal.
(c) Is every prime ideal of $\mathbb{Z}$ maximal? Justify your answer.

21. In each case show that, if $R$ has the given property, so does any factor ring $R/A$.
(a) Boolean ($r^2 = r$ for all $r \in R$).
(b) Regular (for all $r \in R$, $rsr = r$ for some $s \in R$).
(c) Every element is a unit or a nilpotent.

22. Let $A$ be an ideal of $R$ consisting of nilpotent elements.
(a) If $R/A$ has no idempotents except 0 and 1, show that $R$ has the same property.
(b) If $u + A$ is a unit in $R/A$, show that $u$ is a unit in $R$.
(c) Show $R/A$ has no units except 1 if and only if $R^* = 1 + A$.

23. In each case find all maximal ideals of $R$.

(a) R= Ls (b) R= Zg (c) R= Zo 

24. An additive subgroup $L$ of $R$ is called a left ideal if $Ra \subseteq L$ for all $a \in L$. Show that
$R$ is a division ring if and only if 0 and $R$ are the only left ideals of $R$ (extending
Theorem 5). [Hint: $Ra$ is a left ideal for each $a \in R$.]

25. Let $R$ be a commutative ring. Write $a \mid b$ if $b = ra$ for some $r \in R$.
(a) Show that $Rab \subseteq Ra \cap Rb$ for all $a, b \in R$.
(b) If $Ra + Rb = R$ (see Exercise 14), show that $Rab = Ra \cap Rb$.
(c) Show that $u \in R$ is a unit if and only if $Ru = R$.
(d) Show that $Rp$ is a prime ideal if and only if $p \mid ab$ implies that $p \mid a$ or $p \mid b$.
(e) If $R$ is an integral domain, show that $Ra = Rb$ if and only if $a = ub$ for some
unit $u \in R$.

26. Let $A$, $B$, and $C$ be ideals of $R$ and define
$AB = \{a_1b_1 + a_2b_2 + \cdots + a_nb_n \mid a_i \in A, b_i \in B, n \ge 1\}$.
(a) Show that $AB$ is an ideal of $R$ and $AB \subseteq A \cap B$.
(b) Show that $A(B + C) = AB + AC$ and $(B + C)A = BA + CA$. (Exercise 14.)
(c) Show that $AR = A = RA$.
(d) Show that $A(BC) = (AB)C$.

27. If $a \in R$, write $RaR = \{r_1as_1 + r_2as_2 + \cdots + r_nas_n \mid r_i, s_i \in R, n \ge 1\}$. Show that
$RaR$ is an ideal of $R$ containing $a$, and that it is contained in any such ideal.

28. If $e^2 = e \in R$ and $A$ is an ideal of $R$, show that $eAe = eRe \cap A$, that this is an ideal
of $eRe$, and that every ideal of $eRe$ occurs in this way.

29. Let $R = \begin{bmatrix} F & F \\ 0 & F \end{bmatrix}$, $F$ a field. Show that 0, $R$, $\begin{bmatrix} 0 & F \\ 0 & 0 \end{bmatrix}$, $\begin{bmatrix} F & F \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & F \\ 0 & F \end{bmatrix}$ are the
only ideals of $R$.

30. If $X$ is an ideal of $H(R)$ and $2 \in R^*$, show that $X = H(A)$, where $A = X \cap R$ is an
ideal of $R$. (See Exercise 32 §3.2.)

31. Show that $\mathbb{Z}_2(i)$ has a unique proper ideal $A \neq 0$.

32. (a) Show that $\mathbb{Z}_3(\sqrt{2})$ is a field.
(b) Show that $\mathbb{Z}_2(\sqrt{2})$ has a unique proper ideal $A \neq 0$.

33. If $R$ is commutative, let $N(R) = \{a \in R \mid a \text{ is nilpotent}\}$ (the nil radical of $R$).
(a) Show that $N(R)$ is an ideal of $R$. [Hint: Theorem 4 §3.1.]
(b) Show that $N[R/N(R)] = 0$.
(c) Show that $N(R)$ need not be an ideal if $R$ is not commutative.
(d) Show that $N(R)$ is contained in the intersection of all prime ideals of $R$. (In fact,
this is equality by Example 4 in Appendix C.)

34. A ring $R$ is called a local ring if the set $J(R)$ of nonunits in $R$ forms an ideal.
(a) Show that every division ring $R$ is local. Describe $J(R)$.
(b) If $p$ is a prime, show that $\mathbb{Z}_{(p)} = \{\frac{n}{m} \in \mathbb{Q} \mid p \text{ does not divide } m\}$ is local. Describe
$J(\mathbb{Z}_{(p)})$.
(c) If $p$ is a prime and $n \ge 1$, show that $\mathbb{Z}_{p^n}$ is local. Describe $J(\mathbb{Z}_{p^n})$.
(d) If $R$ is local, show that $R/J(R)$ is a division ring.
(e) Let $R$ be local and let $A \subseteq J(R)$ be an ideal of $R$. Show that $R/A$ is local and
$J(R/A) = \{r + A \mid r \in J(R)\}$.

35. Let $R$ be an integral domain and regard $R \subseteq Q$, where $Q$ is the field of quotients
(Theorem 5 §3.2). If $P$ is a prime ideal of $R$, write $M = R \setminus P = \{u \in R \mid u \notin P\}$.
(a) Show that $1 \in M$ and $M$ is closed under multiplication.
(b) Show that $R_P = \{r/u \mid r \in R, u \in M\}$ is a subring of $Q$.
(c) Show that $R_P$ is a local ring (Exercise 34) called the localization of $R$ at $P$.

36. Let $A$ be an ideal of a ring $R$ consisting of nilpotent elements and assume that $R/A$
is a division ring.
(a) Show that $R$ is local (Exercise 34) and $R^* = R \setminus A$. [Hint: Example 17 §3.1.]
(b) Show that $(1 + A) \trianglelefteq R^*$ and $R^*/(1 + A) \cong (R/A)^*$ as groups.
(c) Assume that $R$ is commutative and $n \in R^*$ for all $n \ge 2$. Show that $(A, +) \cong 1 + A$
as groups. [Hint: $a \mapsto e^a$; see the discussion following Example 17 §3.1.]


3.4 HOMOMORPHISMS 


A ring $R$ is a set with the structure of an additive abelian group and a multiplicative
monoid, together with the distributive laws. In this section, we are interested in the
structure-preserving mappings $\theta: R \to S$, where $S$ is another ring. In Section 2.10,
the structure-preserving mappings from one group to another (the homomorphisms)
turned out to be just those that preserved the operation. A ring has two operations,
which suggests that $\theta: R \to S$ is structure-preserving if it preserves both addition
and multiplication. However, in a ring $R$, the unity $1_R$ is also part of the structure,
so we require $\theta$ to preserve the unity: $\theta(1_R) = 1_S$. This requirement is automatic
for groups but it can fail in general for rings (Example 6 below).

If $R$ and $S$ are rings, a mapping $\theta: R \to S$ is called a ring homomorphism
if, for all $r$ and $r_1$ in $R$:


(1) $\theta(r + r_1) = \theta(r) + \theta(r_1)$.   $\theta$ preserves addition
(2) $\theta(rr_1) = \theta(r) \cdot \theta(r_1)$.   $\theta$ preserves multiplication
(3) $\theta(1_R) = 1_S$.   $\theta$ preserves the unity


If $R$ and $S$ are general rings, the mapping $\theta$ is called a general ring homomorphism
if (1) and (2) hold, but possibly not (3).


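Example 1. If $A$ is an ideal of $R$, the coset map $R \to R/A$ given by $r \mapsto r + A$ is
an onto ring homomorphism.


Example 2. The mapping $k \mapsto \bar{k}$ from $\mathbb{Z}$ to $\mathbb{Z}_n$ is an onto ring homomorphism.


Example 3. If $R_1$ and $R_2$ are rings, the projections $\pi_1: R_1 \times R_2 \to R_1$ and
$\pi_2: R_1 \times R_2 \to R_2$ are onto ring homomorphisms, where $\pi_1(r_1, r_2) = r_1$ and
$\pi_2(r_1, r_2) = r_2$.

Example 4. If $\theta: \begin{bmatrix} R & R \\ 0 & R \end{bmatrix} \to R \times R$ is given by $\theta\begin{bmatrix} r & s \\ 0 & t \end{bmatrix} = (r, t)$, show that $\theta$
is an onto ring homomorphism.


Solution. The reader should verify that $\theta$ is an onto homomorphism of additive
groups. We have $\theta\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = (1, 1)$, so $\theta$ preserves the unity. Finally

$$\theta\left\{\begin{bmatrix} r & s \\ 0 & t \end{bmatrix}\begin{bmatrix} r' & s' \\ 0 & t' \end{bmatrix}\right\} = \theta\begin{bmatrix} rr' & rs' + st' \\ 0 & tt' \end{bmatrix} = (rr', tt') = (r, t) \cdot (r', t') = \theta\begin{bmatrix} r & s \\ 0 & t \end{bmatrix} \cdot \theta\begin{bmatrix} r' & s' \\ 0 & t' \end{bmatrix}.$$

Hence, $\theta$ preserves multiplication. □


Example 5. If $R$ and $S$ are rings, let $\theta: R \to S$ be an onto mapping that
preserves addition and multiplication. Show that $\theta$ is a ring homomorphism, that
is $\theta(1_R) = 1_S$.


Solution. The argument preceding Example 19 §3.1 goes through. □


Example 6. The mapping $\theta: R \to R \times R$, where $\theta(r) = (r, 0)$, is a (one-to-one)
general ring homomorphism that does not preserve the unity if $R \neq 0$.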




Ring homomorphisms are homomorphisms of additive abelian groups, which
gives the first three preservation properties in the next result. We leave the proofs
of the last two as Exercise 10.


Theorem 1. Let $\theta: R \to R_1$ be a ring homomorphism and let $r \in R$.


(1) $\theta(0) = 0$.   $\theta$ preserves zero
(2) $\theta(-r) = -\theta(r)$.   $\theta$ preserves negatives
(3) $\theta(kr) = k\theta(r)$ for all $k \in \mathbb{Z}$.   $\theta$ preserves $\mathbb{Z}$-multiplication
(4) $\theta(r^n) = \theta(r)^n$ for all $n \ge 0$ in $\mathbb{Z}$.   $\theta$ preserves powers
(5) If $u \in R^*$, $\theta(u^k) = \theta(u)^k$ for all $k \in \mathbb{Z}$.   $\theta$ preserves powers

By a rational expression in a ring $R$ we mean a formula made up of letters
representing elements of $R$ that are combined using addition, subtraction, multiplication,
division (by units), and multiplication by integers. Thus, $r^2su^3 - 3su^{-2}r + 2$
is a rational expression where, of course, $u$ is a unit in $R$ and 2 means $2 \cdot 1_R$. Because
of Theorem 1 (and the ring axioms), a ring homomorphism $\theta: R \to S$ preserves
rational expressions. For example, if we write $\theta(x) = \bar{x}$ for every $x \in R$, then

$$\theta(r^2su^3 - 3su^{-2}r + 2) = \bar{r}^2\bar{s}\bar{u}^3 - 3\bar{s}\bar{u}^{-2}\bar{r} + \bar{2}.$$

In particular, if $r \in R$ is a unit, an idempotent, or a nilpotent, the same is true of
the element $\bar{r} = \theta(r)$ in $S$.

The fact that ring homomorphisms preserve rational expressions is very useful.
One reason is that, in many rings derived from a ring $R$ (for example $M_n(R)$), we
define the operations using rational expressions from $R$. Hence, a ring homomorphism
$R \to S$ often induces a homomorphism of the derived ring in a natural way.
Here is an example.


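Example 7. If $\theta: R \to S$ is a ring homomorphism, show that $\bar{\theta}: M_2(R) \to M_2(S)$
is also a ring homomorphism where

$$\bar{\theta}\begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} \theta(r) & \theta(s) \\ \theta(t) & \theta(u) \end{bmatrix} \quad\text{for all } \begin{bmatrix} r & s \\ t & u \end{bmatrix} \text{ in } M_2(R).$$

Solution. We leave to the reader the verification that $\bar{\theta}$ preserves addition and the
unity. For convenience, write $\theta(r) = \bar{r}$ for all $r \in R$. Then

$$\bar{\theta}\left\{\begin{bmatrix} r & s \\ t & u \end{bmatrix}\begin{bmatrix} r' & s' \\ t' & u' \end{bmatrix}\right\} = \bar{\theta}\begin{bmatrix} rr' + st' & rs' + su' \\ tr' + ut' & ts' + uu' \end{bmatrix} = \begin{bmatrix} \bar{r}\bar{r}' + \bar{s}\bar{t}' & \bar{r}\bar{s}' + \bar{s}\bar{u}' \\ \bar{t}\bar{r}' + \bar{u}\bar{t}' & \bar{t}\bar{s}' + \bar{u}\bar{u}' \end{bmatrix} = \begin{bmatrix} \bar{r} & \bar{s} \\ \bar{t} & \bar{u} \end{bmatrix}\begin{bmatrix} \bar{r}' & \bar{s}' \\ \bar{t}' & \bar{u}' \end{bmatrix}.$$

Hence, $\bar{\theta}$ preserves multiplication, and so is a ring homomorphism. □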
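
As a concrete instance of Example 7, take $\theta$ to be reduction $\mathbb{Z} \to \mathbb{Z}_N$ and apply it entrywise to $2 \times 2$ integer matrices. The sketch below spot-checks on random samples that the induced map preserves products and the unity; the modulus `N = 6`, the sampling range, and all helper names are assumptions made for the illustration.

```python
import random

N = 6  # hypothetical modulus; theta is reduction Z -> Z_N

def theta(k):
    return k % N

def theta_bar(x):
    """Apply theta to every entry of a 2x2 matrix given as ((a, b), (c, d))."""
    return tuple(tuple(theta(e) for e in row) for row in x)

def mat_mul(x, y, reduce=lambda e: e):
    """2x2 matrix product; `reduce` is applied to each entry of the result."""
    return tuple(
        tuple(reduce(sum(x[i][k] * y[k][j] for k in range(2))) for j in range(2))
        for i in range(2)
    )

def random_matrix(rng):
    return tuple(tuple(rng.randint(-20, 20) for _ in range(2)) for _ in range(2))

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [(random_matrix(rng), random_matrix(rng)) for _ in range(200)]

# Multiply in M_2(Z) then reduce, versus reduce then multiply in M_2(Z_N).
preserves_products = all(
    theta_bar(mat_mul(x, y)) == mat_mul(theta_bar(x), theta_bar(y), reduce=theta)
    for x, y in samples
)
preserves_unity = theta_bar(((1, 0), (0, 1))) == ((1, 0), (0, 1))
print(preserves_products, preserves_unity)
```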


Another way in which the preservation of rational expressions by homomorphisms
is useful is in showing that an equation in a ring $R$ has no solution in $R$.
The reason is that, if $\theta: R \to S$ is a homomorphism and if an equation has a
solution in $R$, then (because $\theta$ preserves the whole equation) it has a solution in $S$.
Thus by showing that no solution exists in $S$, we show that no solution can exist in
$R$. This approach is useful because the ring $S$ is often much simpler than $R$, so the
task of showing that no solution exists is easier. We give two examples.


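Example 8. Show that $x^3 - 5x^2 - x - 17 = 0$ has no solution in $\mathbb{Z}$.


Solution. Consider the homomorphism $\theta: \mathbb{Z} \to \mathbb{Z}_5$ given by $\theta(k) = \bar{k}$. Suppose that
$n \in \mathbb{Z}$ is a solution: $n^3 - 5n^2 - n - 17 = 0$. Applying $\theta$ gives $\bar{n}^3 - \bar{5}\bar{n}^2 - \bar{n} - \overline{17} = \bar{0}$
in $\mathbb{Z}_5$; that is, $\bar{n}^3 - \bar{n} - \bar{2} = \bar{0}$. But $\bar{n}$ is one of $\bar{0}, \bar{1}, \bar{2}, \bar{3}$, or $\bar{4}$ in $\mathbb{Z}_5$, and a direct check
shows that none of these satisfies the equation $\bar{n}^3 - \bar{n} - \bar{2} = \bar{0}$. Hence, no solution
of the original equation could exist in $\mathbb{Z}$. □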
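
The "direct check" in Example 8 is a five-case computation, which is easy to run. A minimal sketch (the function name `f` is an assumption) evaluates the polynomial at each residue modulo 5:

```python
def f(n, modulus):
    """Value of n^3 - 5n^2 - n - 17 reduced modulo `modulus`."""
    return (n**3 - 5 * n**2 - n - 17) % modulus

# An empty list means no solution in Z_5, and therefore none in Z.
roots_mod_5 = [n for n in range(5) if f(n, 5) == 0]
print(roots_mod_5)
```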


Example 9. Show that $m^3 - 6n^3 = 3$ has no solution in $\mathbb{Z}$.


Solution. Our first temptation is to reduce this modulo 6, obtaining $m^3 = 3$ in $\mathbb{Z}_6$.
But this has a solution ($m = 3$) in $\mathbb{Z}_6$, so there is no gain here. However, in $\mathbb{Z}_7$ the
equation becomes $m^3 + n^3 = 3$. But the only cubes in $\mathbb{Z}_7$ are 0, 1, and 6, and the
sum of two of these is one of 0, 1, 2, 5, or 6. Because 3 is not in this list, there is no
solution in $\mathbb{Z}_7$ and hence none in $\mathbb{Z}$. □
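
Both halves of Example 9 can be confirmed the same way: reduction modulo 6 leaves a solution, while reduction modulo 7 rules one out. The variable names in this sketch are assumptions.

```python
# Cubes in Z_7 and all sums of two of them.
cubes_mod_7 = sorted({(m**3) % 7 for m in range(7)})
sums_mod_7 = sorted({(a + b) % 7 for a in cubes_mod_7 for b in cubes_mod_7})

# Modulo 6 the equation m^3 - 6n^3 = 3 still has solutions, so it gives no information.
solutions_mod_6 = [(m, n) for m in range(6) for n in range(6)
                   if (m**3 - 6 * n**3) % 6 == 3]

print(cubes_mod_7, sums_mod_7, 3 in sums_mod_7, len(solutions_mod_6) > 0)
```

Choosing a modulus that kills one term (here 7 turns $-6n^3$ into $n^3$) while keeping the set of cubes small is the design idea behind the example.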


Our next theorem discusses an important homomorphism of rings of prime characteristic
that will be needed later. The proof depends on a fact about the binomial
coefficients $\binom{p}{k} = \frac{p!}{k!\,(p-k)!}$ that is important in its own right.


Lemma 1. If $p$ is a prime then $p$ divides $\binom{p}{k}$ for each $k = 1, 2, \ldots, p - 1$.


Proof. The definition of $\binom{p}{k}$ gives $p! = \binom{p}{k}\, k!\,(p-k)!$, so $p$ divides the product
$\binom{p}{k}\, k!\,(p-k)!$. Hence, Euclid’s lemma (Theorem 6 §1.2) shows that $p$ must either
divide $\binom{p}{k}$ or divide some factor of $k!\,(p-k)!$. But this latter outcome is impossible
because $1 \le k \le p - 1$, so $p$ must divide $\binom{p}{k}$ as asserted. ∎
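
Lemma 1 is easy to test numerically, and the test also shows that primality matters. The sketch below uses `math.comb` from the Python standard library; the function name `divides_middle_binomials` is an assumption.

```python
from math import comb

def divides_middle_binomials(p):
    """True if p divides C(p, k) for every k = 1, ..., p - 1."""
    return all(comb(p, k) % p == 0 for k in range(1, p))

print([p for p in range(2, 20) if divides_middle_binomials(p)])
```

For a composite modulus such as 4 the property fails, because $\binom{4}{2} = 6$ is not divisible by 4.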


Theorem 2. Let $R \neq 0$ be a commutative ring of prime characteristic $p$, and define

$$\varphi: R \to R \quad\text{by}\quad \varphi(r) = r^p \text{ for all } r \in R.$$

Then $\varphi$ is a ring homomorphism (the Frobenius Endomorphism). If $R$ is a finite
field then $\varphi$ is an isomorphism (the Frobenius Automorphism).


Proof. Clearly $\varphi(1) = 1$, and $\varphi(rs) = \varphi(r)\varphi(s)$ because $R$ is commutative. We have

$$\varphi(r + s) = r^p + \binom{p}{1}r^{p-1}s + \cdots + \binom{p}{p-1}rs^{p-1} + s^p$$

for all $r, s \in R$ by the binomial theorem. But $p$ divides each of the coefficients
$\binom{p}{1}, \ldots, \binom{p}{p-1}$ by Lemma 1, so each of these coefficients is zero in $R$ because $R$ has
characteristic $p$. Hence, $\varphi(r + s) = \varphi(r) + \varphi(s)$, so $\varphi$ is a ring homomorphism.

If $R$ is a field, observe that $\ker\varphi$ is an ideal of $R$ (see Theorem 3 below). Moreover,
$\ker\varphi \neq R$ because $\varphi(1_R) = 1_R$. Hence, $\ker\varphi = 0$ because $R$ is a simple ring,
so $\varphi$ is one-to-one (being an additive group homomorphism). If $R$ is finite, then $\varphi$
is also onto (Theorem 2 §0.3), and so $\varphi$ is an isomorphism. ∎
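
In $\mathbb{Z}_p$ itself the Frobenius map is the identity, so a slightly larger field gives a more informative check. The sketch below assumes the 9-element field $\mathbb{Z}_3(i)$, with elements $a + bi$ stored as pairs `(a, b)` and $i^2 = -1$; the encoding and function names are illustrative choices, not notation from the text.

```python
from itertools import product

P = 3
ELEMENTS = list(product(range(P), repeat=2))  # (a, b) stands for a + bi

def add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def mul(x, y):
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)  # uses i^2 = -1

def frobenius(x):
    """Compute x^P by repeated multiplication."""
    result = (1, 0)
    for _ in range(P):
        result = mul(result, x)
    return result

additive = all(frobenius(add(x, y)) == add(frobenius(x), frobenius(y))
               for x in ELEMENTS for y in ELEMENTS)
multiplicative = all(frobenius(mul(x, y)) == mul(frobenius(x), frobenius(y))
                     for x in ELEMENTS for y in ELEMENTS)
bijective = len({frobenius(x) for x in ELEMENTS}) == len(ELEMENTS)
print(additive, multiplicative, bijective, frobenius((0, 1)))
```

Here the map sends $a + bi$ to $a - bi$, so it is a genuine automorphism other than the identity.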


If $Q$ is an infinite field, a ring homomorphism $Q \to Q$ can exist that is one-to-one
but not onto. If $p$ is a prime, let $Q$ denote the field of quotients of the integral
domain $\mathbb{Z}_p[x]$ (see Chapter 4). Then $\operatorname{char} Q = p$ so the Frobenius endomorphism
$q \mapsto q^p$ is a one-to-one ring homomorphism $Q \to Q$, but it is not onto.


The Isomorphism Theorem


A ring homomorphism $\theta: R \to S$ is, in particular, a homomorphism of additive
groups. Hence, it has a kernel and an image

$$\ker\theta = \{a \in R \mid \theta(a) = 0\} \quad\text{and}\quad \operatorname{im}\theta = \theta(R) = \{\theta(r) \mid r \in R\}.$$

These are additive subgroups of $R$ and $S$, respectively and, by Theorem 3 §2.10,
$\theta: R \to S$ is one-to-one if and only if $\ker\theta = 0$. We also have the ring theoretic
analogue of Theorem 1 §2.10.


Theorem 3. Let $\theta: R \to S$ be a ring homomorphism.
(1) $\theta(R)$ is a subring of $S$.
(2) $\ker\theta$ is an ideal of $R$.


Proof. (1) We know that $\theta(R)$ is an additive subgroup of $S$, and it is closed
under multiplication because $\theta(a) \cdot \theta(b) = \theta(ab)$. Finally, our insistence that ring
homomorphisms preserve the unity gives $1_S = \theta(1_R) \in \theta(R)$.

(2) Group theory shows that $\ker\theta$ is an additive subgroup of $R$. If $r \in R$ and
$a \in \ker\theta$, then $\theta(ra) = \theta(r) \cdot \theta(a) = \theta(r) \cdot 0 = 0$. Thus $ra \in \ker\theta$ and, similarly,
$ar \in \ker\theta$. Hence, $\ker\theta$ is an ideal of $R$. ∎


As for groups, part (2) of Theorem 3 has a converse: Every ideal $A$ of a ring $R$ is
the kernel of some ring homomorphism $R \to S$. In fact, the coset map $\varphi: R \to R/A$
is a ring homomorphism (in fact onto) with $\ker\varphi = A$.

We now come to the most important theorem of this section, the ring analogue
of the isomorphism theorem for groups.


Theorem 4. Isomorphism Theorem. Let $\theta: R \to S$ be a ring homomorphism
and write $A = \ker\theta$. Then $\theta$ induces the ring isomorphism

$$\bar{\theta}: R/A \to \theta(R) \quad\text{given by}\quad \bar{\theta}(r + A) = \theta(r) \text{ for all } r \in R.$$

Proof. The kernel $A$ of $\theta$ is an ideal of $R$ by Theorem 3, so $R/A$ is a ring. Given $a$
and $b$ in $R$, compute

$$a + A = b + A \iff (a - b) \in A \iff \theta(a - b) = 0 \iff \theta(a) = \theta(b).$$

This shows that $\bar{\theta}$ is well defined and one-to-one. Because $\bar{\theta}$ is clearly onto $\theta(R)$, it
remains to show that $\bar{\theta}$ is a ring homomorphism. Now

$$\bar{\theta}(1_{R/A}) = \bar{\theta}(1_R + A) = \theta(1_R) = 1_S$$

is the unity of $\theta(R)$, so $\bar{\theta}$ preserves the unity. For $a, b \in R$, we have

$$\bar{\theta}[(a + A)(b + A)] = \bar{\theta}(ab + A) = \theta(ab) = \theta(a) \cdot \theta(b) = \bar{\theta}(a + A) \cdot \bar{\theta}(b + A),$$

so $\bar{\theta}$ preserves multiplication. Similarly, $\bar{\theta}$ preserves addition and so is a ring
isomorphism. ∎


As for groups, the ring isomorphism theorem is very useful and reveals structure
whenever it is used. We devote much of the remainder of this section to illustrations
of how it is employed. We begin with three examples (Examples 10-12) involving
specific rings. The general theme is: To show that $A$ is an ideal of $R$ and $R/A \cong S$,
find an onto ring homomorphism $\theta: R \to S$ with $\ker\theta = A$.


Example 10. Let $A$ and $B$ be ideals of $R$ and $S$, respectively. Show that $A \times B$ is
an ideal of $R \times S$, and $\frac{R \times S}{A \times B} \cong \frac{R}{A} \times \frac{S}{B}$.


Solution. Define $\theta: R \times S \to \frac{R}{A} \times \frac{S}{B}$ by $\theta(r, s) = (r + A, s + B)$. Then $\theta$ is an onto
ring homomorphism and $\ker\theta = A \times B$, so the isomorphism theorem does it. □


It is worth noting that every ideal of $R \times S$ has the form $A \times B$, where $A$ and $B$ are
ideals of $R$ and $S$, respectively (Exercise 5(b) §3.3). Hence, Example 10 describes
all homomorphic images of $R \times S$.

Similarly, every ideal of $M_n(R)$ has the form $M_n(A)$ for some ideal $A$ of $R$
(Lemma 3 §3.3), so the next example describes all homomorphic images of $M_n(R)$.


Example 11. If $A$ is an ideal of a ring $R$, show that $M_n(A)$ is an ideal of $M_n(R)$,
and that $\frac{M_n(R)}{M_n(A)} \cong M_n\!\left(\frac{R}{A}\right)$.


Solution. If $r \in R$, we write $\bar{r} = r + A$ in $R/A$ for convenience. Then the coset
map $\varphi: R \to R/A$, given by $\varphi(r) = \bar{r}$ for all $r \in R$, is an onto ring homomorphism.
Hence, $\varphi$ induces the homomorphism

$$\bar{\varphi}: M_n(R) \to M_n\!\left(\frac{R}{A}\right) \quad\text{given by}\quad \bar{\varphi}[a_{ij}] = [\bar{a}_{ij}] \text{ for all } [a_{ij}] \in M_n(R).$$

Since $\varphi$ is a ring homomorphism, it is a routine verification that the same is true
of $\bar{\varphi}$ (Example 7 is the case $n = 2$), and $\bar{\varphi}$ is onto because $\varphi$ is onto. Moreover
$\ker\bar{\varphi} = M_n(A)$ because $\ker\varphi = A$. Now the isomorphism theorem applies. □


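Example 12. If $m \mid n$, find an ideal $A$ of $\mathbb{Z}_n$ such that $\mathbb{Z}_n/A \cong \mathbb{Z}_m$.


Solution. This can be solved directly by examining the factor rings of $\mathbb{Z}_n$, but
(as is often the case) it is easier to let the isomorphism theorem do the work.
Because $\mathbb{Z}_n = \{k + n\mathbb{Z} \mid k \in \mathbb{Z}\}$, there is a natural map $\theta: \mathbb{Z}_n \to \mathbb{Z}_m$ given by
$\theta(k + n\mathbb{Z}) = k + m\mathbb{Z}$. This mapping is well defined because $m \mid n$:

$$k + n\mathbb{Z} = k' + n\mathbb{Z} \;\Rightarrow\; n \mid (k - k') \;\Rightarrow\; m \mid (k - k') \;\Rightarrow\; k + m\mathbb{Z} = k' + m\mathbb{Z}.$$

With this $\theta$ is clearly an onto ring homomorphism, so we are done with $A = \ker\theta$
by the isomorphism theorem. In fact, writing $\bar{q} = q + n\mathbb{Z}$,

$$\ker\theta = \{k + n\mathbb{Z} \mid k + m\mathbb{Z} = m\mathbb{Z}\} = \{mq + n\mathbb{Z} \mid q \in \mathbb{Z}\} = \{m\bar{q} \mid q \in \mathbb{Z}\} = m\mathbb{Z}_n. \qquad □$$
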
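
For small $n$ and $m$ the kernel in Example 12 and the resulting cosets can be listed directly. In this sketch residues are represented by integers $0, \ldots, n-1$, and the function name `kernel_and_fibres` is an assumption; each fibre of $\theta$ is one coset of the kernel, which is the bijection that the isomorphism theorem promises.

```python
def kernel_and_fibres(n, m):
    """For m | n, return the kernel of Z_n -> Z_m and the fibre over each image."""
    assert n % m == 0, "the map k + nZ -> k + mZ is well defined only when m divides n"
    theta = {k: k % m for k in range(n)}
    kernel = sorted(k for k in range(n) if theta[k] == 0)
    fibres = {}
    for k, image in theta.items():
        fibres.setdefault(image, []).append(k)
    return kernel, fibres

kernel, fibres = kernel_and_fibres(12, 4)
print(kernel)   # the multiples of 4 in Z_12
print(fibres)   # one coset of the kernel for each element of Z_4
```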

Theorem 5. If $R$ is any ring, then $\mathbb{Z}1_R = \{k1_R \mid k \in \mathbb{Z}\}$ is a subring of $R$ that is
contained in the center of $R$. Moreover,

(1) If $R$ has characteristic $n > 0$, then $\mathbb{Z}1_R \cong \mathbb{Z}_n$.

(2) If $R$ has characteristic 0, then $\mathbb{Z}1_R \cong \mathbb{Z}$.

Proof. Define $\theta: \mathbb{Z} \to R$ by $\theta(k) = k1_R$ for all $k \in \mathbb{Z}$. This map is a ring
homomorphism by Theorem 2 §3.1, so $\mathbb{Z}1_R = \theta(\mathbb{Z})$ is a subring of $R$ by Theorem 3.
Moreover, $\mathbb{Z}1_R$ is contained in the center of $R$ because $r(k1_R) = kr = (k1_R)r$ for
all $r \in R$ and $k \in \mathbb{Z}$ by the distributive laws and Theorem 2 §3.1 (verify).

We have $\ker\theta = \{k \in \mathbb{Z} \mid k1_R = 0\}$. If $R$ has characteristic $n > 0$, then we
have $\ker\theta = n\mathbb{Z}$ by Theorem 3 §3.1. Hence, $\mathbb{Z}1_R = \theta(\mathbb{Z}) \cong \mathbb{Z}/n\mathbb{Z} = \mathbb{Z}_n$ by the
isomorphism theorem, proving (1). If $R$ has characteristic 0, then $\ker\theta = 0$ and (2)
again follows by the isomorphism theorem. ∎
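
Part (1) of Theorem 5 can be seen in a small case by listing the multiples of the unity. The sketch below assumes the ring $\mathbb{Z}_m \times \mathbb{Z}_n$ with unity $(1, 1)$; the function name `prime_subring` is hypothetical.

```python
def prime_subring(m, n):
    """Multiples k*(1, 1) of the unity in Z_m x Z_n, listed until they repeat."""
    seen = []
    x = (0, 0)
    while x not in seen:
        seen.append(x)
        x = ((x[0] + 1) % m, (x[1] + 1) % n)
    return seen

multiples = prime_subring(4, 6)
print(len(multiples))  # the characteristic of Z_4 x Z_6
```

The number of distinct multiples is the characteristic, and the list is a copy of $\mathbb{Z}_k$ for that characteristic $k$ sitting inside the 24-element ring.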


Theorem 5 is particularly important if $R$ is a field. In this case a subring $S$ of
$R$ is called a subfield of $R$ if it is itself a field, or equivalently if $s^{-1} \in S$ whenever
$0 \neq s \in S$. Now the characteristic of a field $R$ is either 0 or a prime $p$. If $\operatorname{char} R = p$,
Theorem 5 shows that $R$ contains a central subfield $\mathbb{Z}1_R \cong \mathbb{Z}_p$.

If $\operatorname{char} R = 0$, the central subring $\mathbb{Z}1_R$ is isomorphic to $\mathbb{Z}$. In this case define

$$Q = \{uv^{-1} \mid u, v \text{ in } \mathbb{Z}1_R,\ v \neq 0\}.$$

This is easily verified to be a central subfield of $R$, and we claim that $Q \cong \mathbb{Q}$. Indeed,
the map $\varphi: \mathbb{Q} \to Q$ given by $\varphi(n/m) = (n1_R)(m1_R)^{-1}$ is a ring isomorphism. We
leave the verification to the reader with the observation that the proof that $\varphi$ is
well defined and one-to-one uses the following fact: Since $\operatorname{char} R = 0$, if $n \in \mathbb{Z}$ then
$n1_R = 0$ if and only if $n = 0$. This proves the


Corollary. Every field $R$ contains a central subfield isomorphic to $\mathbb{Z}_p$ or $\mathbb{Q}$ according
as $\operatorname{char} R = p$ or $\operatorname{char} R = 0$.

Because of this result, the fields $\mathbb{Z}_p$ and $\mathbb{Q}$ are called prime fields. They are
important in field theory and we mention them again in Chapter 6.


We can reduce many questions about general rings (with no unity) to the case
of rings by a standard construction. If $R$ is a general ring, consider the set

$$R^1 = \mathbb{Z} \times R$$

and define operations on $R^1$ as follows:

$$(n, r) + (m, s) = (n + m, r + s)$$
$$(n, r)(m, s) = (nm, ns + mr + rs)$$

Then $R^1$ is a ring with unity $(1, 0)$ as the reader can easily verify, and the mapping
$\theta: R^1 \to \mathbb{Z}$ defined by $\theta(n, r) = n$ is an onto ring homomorphism for which
$\ker\theta = \{(0, r) \mid r \in R\}$. The mapping $\sigma: R \to \ker\theta$ with $\sigma(r) = (0, r)$ is a
one-to-one, onto general ring homomorphism (preserves addition and multiplication).
Hence, we may regard $R$ as a subset of $R^1$ by identifying $r = (0, r)$ for all $r \in R$.
This being done, $R$ is an ideal of $R^1$ and the isomorphism theorem gives


Theorem 6. If $R$ is a general ring, a ring $R^1$ exists, containing $R$ as an ideal such
that $R^1/R \cong \mathbb{Z}$.
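
The construction of $R^1$ translates directly into code. As a sketch, take the general ring $R = 2\mathbb{Z}$ of even integers (an assumption made for illustration; it has no unity) and spot-check the ring axioms for $R^1 = \mathbb{Z} \times 2\mathbb{Z}$ on a finite sample of pairs.

```python
from itertools import product

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    """(n, r)(m, s) = (nm, ns + mr + rs), as in the construction of R^1."""
    n, r = x
    m, s = y
    return (n * m, n * s + m * r + r * s)

ONE = (1, 0)
EVENS = range(-4, 5, 2)
sample = list(product(range(-2, 3), EVENS))  # a finite sample of Z x 2Z

unity_ok = all(mul(ONE, x) == x == mul(x, ONE) for x in sample)
assoc_ok = all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x in sample for y in sample for z in sample)
distrib_ok = all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
                 for x in sample for y in sample for z in sample)
# R sits inside R^1 as the pairs (0, r), multiplied as in R and absorbing products.
embeds_ok = all(mul((0, r), (0, s)) == (0, r * s) for r in EVENS for s in EVENS)
ideal_ok = all(mul(x, (0, s))[0] == 0 and mul((0, s), x)[0] == 0
               for x in sample for s in EVENS)
print(unity_ok, assoc_ok, distrib_ok, embeds_ok, ideal_ok)
```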


Decompositions of Rings 


When trying to ascertain the structure of a ring, it is useful to have a condition that
ensures that a ring $R$ is isomorphic to a direct product of two subrings. We need
the following notion: If $A$ and $B$ are ideals of a ring $R$, define their sum $A + B$ by

$$A + B = \{a + b \mid a \in A,\ b \in B\}.$$

It is not difficult to show that $A + B$ is again an ideal of $R$, the smallest that contains
both $A$ and $B$. Our interest is in the case when $A + B = R$.


Theorem 7. Let $R$ be a ring with ideals $A$ and $B$ such that

$$R = A + B \quad\text{and}\quad A \cap B = \{0\}.$$

Let $1 = e + f$ in $R$ where $e \in A$ and $f \in B$. Then

(1) $A$ and $B$ are rings with unities $e$ and $f$ respectively (both central in $R$).

(2) $R \cong A \times B$ as rings.

Proof. (1) If $a \in A$ then $a = ae + af$ so $a - ae = af \in A \cap B = 0$. It follows that
$a = ae$, and similarly $a = ea$. Hence, $e$ is the unity for $A$ (and so $e^2 = e$). In the
same way, $f = f^2$ is the unity of $B$. They are central in $R$ by Example 5 §3.3.

(2) Define $\theta: A \times B \to R$ by $\theta(a, b) = a + b$. Then $\theta$ is onto because $R = A + B$,
$\theta(e, f) = 1$, and $\theta$ is easily verified to be a homomorphism of additive groups.
Moreover, if $\theta(a, b) = 0$ then $a = -b \in A \cap B = 0$, and it follows that $\theta$ is
one-to-one. To see that $\theta$ preserves multiplication, note that if $a \in A$ and $b \in B$ then
$ab \in A \cap B = 0$, and similarly $ba = 0$. But then, if $a' \in A$ and $b' \in B$:

$$\theta(a, b) \cdot \theta(a', b') = (a + b)(a' + b') = aa' + bb' = \theta(aa', bb') = \theta[(a, b)(a', b')].$$

Hence, $\theta$ is a ring homomorphism, and so is an isomorphism. ∎


There is a converse to part (1) in Theorem 7. If $e^2 = e \in R$ is central, then
$A = eR = Re = eRe$ is an ideal. Moreover, $f = 1 - e$ is also central, and $B = fRf$
is also an ideal. One verifies that $R = A + B$ and $A \cap B = 0$, so Theorem 7 gives


Corollary. If $e^2 = e \in R$ is central, then $R \cong eRe \times (1 - e)R(1 - e)$.
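
A small commutative case makes the Corollary concrete. The sketch below assumes $R = \mathbb{Z}_6$ and the idempotent $e = 3$, and uses the map $r \mapsto (er, (1-e)r)$, which is the inverse of the map $(a, b) \mapsto a + b$ from the proof of Theorem 7; the function names are illustrative.

```python
def idempotents(n):
    """All e in Z_n with e^2 = e (every element is central since Z_n is commutative)."""
    return [e for e in range(n) if (e * e) % n == e]

def decompose(n, e):
    """Return A = eR, B = (1 - e)R and the image (er, (1 - e)r) of each r in Z_n."""
    f = (1 - e) % n
    A = sorted({(e * r) % n for r in range(n)})
    B = sorted({(f * r) % n for r in range(n)})
    images = [((e * r) % n, (f * r) % n) for r in range(n)]
    return A, B, images

A, B, images = decompose(6, 3)
one_to_one = len(set(images)) == 6
multiplicative = all(
    images[(r * s) % 6]
    == ((images[r][0] * images[s][0]) % 6, (images[r][1] * images[s][1]) % 6)
    for r in range(6) for s in range(6)
)
recombines = all((a + b) % 6 == r for r, (a, b) in enumerate(images))
print(idempotents(6), A, B, one_to_one, multiplicative, recombines)
```

Here $A = \{0, 3\}$ is a copy of $\mathbb{Z}_2$ with unity 3 and $B = \{0, 2, 4\}$ is a copy of $\mathbb{Z}_3$ with unity 4.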


Let $A$ and $B$ be ideals of a ring $R$. Theorem 7 characterizes when $R$ is isomorphic
to $A \times B$. Part (2) of the next theorem gives essentially the same result in the form
that $R$ is isomorphic to $(R/A) \times (R/B)$.


Theorem 8. Chinese Remainder Theorem.$^{56}$ Let $A$ and $B$ be ideals of $R$.
(1) If $A + B = R$ then $\frac{R}{A \cap B} \cong \frac{R}{A} \times \frac{R}{B}$.
(2) If $A + B = R$ and $A \cap B = 0$ then $R \cong \frac{R}{A} \times \frac{R}{B}$.


Proof. Since (1) implies (2) because $\frac{R}{0} \cong R$, we need only prove (1). Define

$$\psi: R \to \frac{R}{A} \times \frac{R}{B} \quad\text{by}\quad \psi(r) = (r + A, r + B) \text{ for all } r \in R.$$

Then $\psi$ is a ring homomorphism and $\ker\psi = A \cap B$. Hence, by the isomorphism
theorem, it remains to show that $\psi$ is onto. Since $A + B = R$, write $1 = a + b$
where $a \in A$ and $b \in B$. Given $(s + A, t + B) \in \frac{R}{A} \times \frac{R}{B}$ where $s$ and $t$ are in $R$, let
$r = sb + ta$. Then

$$s - r = s(1 - b) - ta = (s - t)a \in A$$

so $s + A = r + A$. Similarly $t + B = r + B$, and so $\psi(r) = (s + A, t + B)$. This shows
that $\psi$ is onto, as required. ∎


$^{56}$ The name derives from the fact that a special case ($R = \mathbb{Z}$) of the theorem was known to the
Chinese in the first century AD.



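Corollary 1. If $m$ and $n$ are relatively prime, then $\mathbb{Z}_{mn} \cong \mathbb{Z}_m \times \mathbb{Z}_n$.


Proof. We are asking that $\mathbb{Z}/mn\mathbb{Z} \cong \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/n\mathbb{Z}$ so, taking $R = \mathbb{Z}$, $A = m\mathbb{Z}$,
and $B = n\mathbb{Z}$ in Theorem 8, we must prove that $m\mathbb{Z} + n\mathbb{Z} = \mathbb{Z}$ and $m\mathbb{Z} \cap n\mathbb{Z} = mn\mathbb{Z}$.
The first follows from $\gcd(m, n) = 1$ since then $1 = mp + nq$ where $p, q \in \mathbb{Z}$, so 1 is
in the ideal $m\mathbb{Z} + n\mathbb{Z}$. Hence, $m\mathbb{Z} + n\mathbb{Z} = \mathbb{Z}$ by Theorem 3 §3.3.

If $k \in m\mathbb{Z} \cap n\mathbb{Z}$, then $m \mid k$ and $n \mid k$, so $mn \mid k$, again because $\gcd(m, n) = 1$
(Theorem 5 §1.2). Hence, $m\mathbb{Z} \cap n\mathbb{Z} \subseteq mn\mathbb{Z}$; the other inclusion always holds. ∎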
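
The proof that $\psi$ is onto is constructive, and for $R = \mathbb{Z}$ it becomes a short algorithm: write $1 = a + b$ with $a \in m\mathbb{Z}$ and $b \in n\mathbb{Z}$, then take $r = sb + ta$. The sketch below follows that recipe; the function names `extended_gcd` and `crt_preimage` are assumptions, and the extended Euclidean algorithm is used only as one convenient way to obtain $a$ and $b$.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b) for nonnegative a, b."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def crt_preimage(s, t, m, n):
    """Find r in Z_mn with r = s (mod m) and r = t (mod n), for gcd(m, n) = 1."""
    g, p, q = extended_gcd(m, n)
    if g != 1:
        raise ValueError("m and n must be relatively prime")
    a = m * p  # a lies in mZ
    b = n * q  # b lies in nZ, and a + b == 1
    return (s * b + t * a) % (m * n)

r = crt_preimage(2, 3, 4, 9)
print(r, r % 4, r % 9)
```

Running over all pairs $(s, t)$ gives each residue modulo $mn$ exactly once, which is the bijection $\mathbb{Z}_{mn} \to \mathbb{Z}_m \times \mathbb{Z}_n$ read backwards.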


Corollary 1 has a useful application to number theory. Recall that we defined
the Euler function $\varphi$ (at the end of Section 2.6) by taking $\varphi(n)$ to be the number
of integers in the set $\{1, 2, \ldots, n - 1\}$ that are relatively prime to $n$. Hence
$\varphi(n) = |\mathbb{Z}_n^*|$, and it is here that Corollary 1 comes into play.


Corollary 2. If $m \ge 2$ and $n \ge 2$ are relatively prime, then $\varphi(mn) = \varphi(m) \cdot \varphi(n)$.


Proof. We have $\mathbb{Z}_{mn}^* \cong (\mathbb{Z}_m \times \mathbb{Z}_n)^* = \mathbb{Z}_m^* \times \mathbb{Z}_n^*$ from Corollary 1, and the result
follows because $\varphi(k) = |\mathbb{Z}_k^*|$ for all $k \ge 2$. ∎
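
Corollary 2 is simple to test by counting. The sketch below computes $\varphi$ straight from the definition above (the function name `phi` is an assumption) and compares $\varphi(mn)$ with $\varphi(m)\varphi(n)$ for relatively prime pairs.

```python
from math import gcd

def phi(n):
    """Number of integers in {1, ..., n - 1} relatively prime to n, for n >= 2."""
    return sum(1 for k in range(1, n) if gcd(k, n) == 1)

coprime_pairs = [(m, n) for m in range(2, 15) for n in range(2, 15) if gcd(m, n) == 1]
multiplicative = all(phi(m * n) == phi(m) * phi(n) for m, n in coprime_pairs)
print(multiplicative, phi(12), phi(3) * phi(4))
```

The hypothesis that $m$ and $n$ are relatively prime cannot be dropped: $\varphi(4) = 2$ while $\varphi(2)\varphi(2) = 1$.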


Emmy Noether (1882-1935). Hermann Weyl has described Emmy Noether as “a great
mathematician, the greatest, I firmly believe, that her sex has ever produced, and a great
woman.” She was born in Bavaria, the daughter of a well-known algebraist Max Noether.
She completed her doctorate at Erlangen in 1907 and, in 1916, went to Göttingen to
work with David Hilbert. Göttingen was then one of the leading centers of mathematics
and, by 1930, Noether had established a fertile and influential research program that was
recognized as the primary center of algebraic thought in the world. But, even with the
enthusiastic support of Hilbert, she never attained more than an honorary professorship
at Göttingen in part because she was a woman. With the rise of Hitler, she was forced
to leave because she was a Jew, and she spent the last 2 years of her life at Bryn Mawr
college in Pennsylvania.


Her work touched several fields (general relativity and the calculus of variations, among
others), but her genius flowered in algebra. However, she published comparatively little
(she was most generous in sharing her ideas with others, especially her students).
Even so, she created a whole new trend in algebra, emphasizing axiomatic concepts of 
great generality. To quote the Russian mathematician P.S. Alexandroff, “Emmy Noether 
taught us to think in a simpler and more general way; in terms of homomorphisms, of 
ideals—not in terms of complicated algebraic calculations. She therefore opened a path 
to the discovery of algebraic regularities where previously they had been obscured by 
complicated specific conditions.” Her 1921 paper on ideal theory was a landmark and 
has had a profound influence on ring theory and on algebra generally. It emphasized the 
fundamental importance of certain finiteness conditions, some of which can be traced 
back to Dedekind. As a result, rings satisfying the so-called ascending chain condition 
on ideals are now called noetherian rings. 


Exercises 3.4 
Throughout these exercises R denotes a ring unless otherwise specified. 
1. In each case determine whether the map $\theta$ is a ring homomorphism. Support your
answer.
(a) g: 23 — Zx2, where A(r) = 4r 




(b) @: Z4 — Zr, where O(r) = 8r i 
(c) $\theta: R \times R \to R$, where $\theta(r, s) = r + s$
(d) $\theta: R \times R \to R$, where $\theta(r, s) = rs$
(e) $\theta: F(\mathbb{R}, \mathbb{R}) \to \mathbb{R}$, where $\theta(f) = f(1)$


2. Let $\theta: R \to S$ be a general ring homomorphism, where $R$ and $S$ are rings. Show that $\theta$
is a ring homomorphism if: (a) $\theta$ is onto; (b) $S$ is a domain and $\theta(1) \neq 0$.

3. Show that a general ring homomorphism $\theta: \mathbb{Z} \to \mathbb{Z}$ is either a ring isomorphism or
$\theta(k) = 0$ for all $k \in \mathbb{Z}$.
Determine all onto ring (general ring) homomorphisms Z12 — Zs. 


5. If $\theta: R \to R_1$ is an onto ring homomorphism, show that $\theta[Z(R)] \subseteq Z(R_1)$. Give an
example showing that this need not be equality.

6. If $\theta: R \to R_1$ is a ring homomorphism and $\operatorname{char} R = n > 0$, show that $\operatorname{char} R_1$
divides $n$.

7. Show that the composite of two ring homomorphisms is a ring homomorphism.

8. Let $R$ and $S$ be rings and let $\theta: R \to S$ be a general ring homomorphism (that is,
$\theta(1)$ may not be the unity of $S$). If $\theta(1) = e$, show that $e^2 = e$ in $S$, $\theta(R) \subseteq eSe$,
and $\theta: R \to eSe$ is a ring homomorphism.

9. Describe the homomorphic images of a division ring.

10. Prove (4) and (5) of Theorem 1.

. Show that «2° — 82? + 52 +3 =0 has no solution z € Z. 

. Show that m? + 14n3 = 12 has no solution in Z. 

. Show that 7m? + 11n? = 9 has no solution in Z. 

. Show that n° + (n+ 1)? + (n+ 2)9 = k? +1 has no solution in Z. 

15. If $\sigma: R \to S$ is a ring isomorphism, show that the same is true of the inverse map
$\sigma^{-1}: S \to R$.

16. Show that the set $\operatorname{aut} R$ of all automorphisms of $R$ is a group under composition.

17. Show that the isomorphism relation $\cong$ is an equivalence on the class of all rings.


18. Let $R = \begin{bmatrix} F & F \\ 0 & F \end{bmatrix}$ where $F$ is a field. Determine all homomorphic images of $R$.
[Hint: Exercise 29 §3.3.]


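19. Let $\theta: R \to S$ be an onto ring homomorphism.
(a) If $A$ is an ideal of $R$, show that $\theta(A) = \{\theta(a) \mid a \in A\}$ is an ideal of $S$.
(b) If also $\ker\theta \subseteq A$, show that $\frac{R}{A} \cong \frac{S}{\theta(A)}$. [Hint: Use the isomorphism theorem
where $\alpha: R \to \frac{S}{\theta(A)}$ is defined by $\alpha(r) = \theta(r) + \theta(A)$ for all $r \in R$.]

20. If $n > 0$ in $\mathbb{Z}$, describe all the ideals of $\mathbb{Z}$ that contain $n\mathbb{Z}$.

21. Show that there is no ring homomorphism $\mathbb{C} \to \mathbb{R}$.

22. Let $\theta: R \to S$ be a ring homomorphism. If $\theta(R)$ and $\ker\theta$ both contain no nonzero
nilpotents show that the same is true of $R$.

23. Let $\theta: R \to S$ be a ring homomorphism and let $A \subseteq R$ and $B \subseteq S$ be ideals.
(a) If $\theta(A) \subseteq B$, show that $\theta$ induces a unique ring homomorphism $\bar{\theta}: R/A \to S/B$
such that $\bar{\theta}\varphi = \varphi'\theta$ as shown in the figure (where $\varphi$ and $\varphi'$ are the coset maps).
[Figure: commutative square with $\theta: R \to S$ on top, the coset maps $\varphi: R \to R/A$ and
$\varphi': S \to S/B$ on the sides, and $\bar{\theta}: R/A \to S/B$ on the bottom.]
(b) Show that (a) applies where $R$ and $S$ are commutative and $A = N(R)$ and $B = N(S)$
are the ideals of all nilpotent elements. (See Exercise 33 §3.3.)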


24. If $u \in R^*$ consider the inner automorphism $\sigma_u: R \to R$ defined by $\sigma_u(r) = uru^{-1}$ for all $r \in R$ (see Example 18 §3.1). Write inn $R = \{\sigma_u \mid u \in R^*\}$ for the set of inner automorphisms of $R$.
(a) Show that inn $R$ is a normal subgroup of aut $R$.
(b) If $Z = Z(R)$, show that $Z \cap R^* \triangleleft R^*$ and $R^*/(Z \cap R^*) \cong$ inn $R$ as groups.

25. If $ab = 1$ in $R$, write $e = ba$ and define $\sigma: R \to R$ by $\sigma(r) = bra$.
(a) Show that $e^2 = e$ and that $\sigma: R \to eRe$ is a ring isomorphism.
(b) Use (a) to show that $ab = 1$ implies that $ba = 1$ if $R$ is a finite ring.

26. If $F$ is a field, find a maximal ideal $M$ of $R = \left\{ \begin{bmatrix} a & b \\ 0 & a \end{bmatrix} \,\middle|\, a, b \in F \right\}$. Describe $R/M$.

27. Let $R = \begin{bmatrix} S & S \\ 0 & S \end{bmatrix}$ be the upper triangular matrix ring over a ring $S$. Show that $A = \begin{bmatrix} 0 & S \\ 0 & 0 \end{bmatrix}$ is an ideal of $R$ and $R/A \cong S \times S$.

28. If $p$ is a prime, let $\mathbb{Z}_{(p)} = \{\frac{n}{m} \in \mathbb{Q} \mid p \text{ does not divide } m\}$ and consider the set $J(\mathbb{Z}_{(p)}) = \{\frac{n}{m} \in \mathbb{Z}_{(p)} \mid p \text{ divides } n\}$. Show that $J(\mathbb{Z}_{(p)})$ is an ideal of $\mathbb{Z}_{(p)}$ and $\mathbb{Z}_{(p)}/J(\mathbb{Z}_{(p)}) \cong \mathbb{Z}_p$. (See Exercise 34 §3.3.)

29. Consider $R(\omega)$ where $\omega^2 = -1$, as discussed preceding Example 5 §3.2.
(a) If $A$ is an ideal of $R$, show that $A(\omega)$ is an ideal of $R(\omega)$ and $\frac{R(\omega)}{A(\omega)} \cong \frac{R}{A}(\omega)$.
(b) Show that $3\mathbb{Z}(i)$ is a maximal ideal of $\mathbb{Z}(i)$.

30. If $A$ is an ideal of $R$, write $\bar R = R/A$ and $\bar r = r + A$, $r \in R$. If $e^2 = e \in R$, show that $eAe = eRe \cap A$, that this is an ideal of $eRe$, and that $(eRe)/(eAe) \cong \bar e \bar R \bar e$.

31. Let $R = S \times T$ and write $\bar S = \{(s, 0) \mid s \in S\}$. Show that $\bar S$ is an ideal of $R$, $R/\bar S \cong T$, and $S \cong \bar S$ as rings. What is the unity of $\bar S$?

32. Prove the Second Isomorphism Theorem: If $A$ is an ideal of $R$ and $S$ is a subring of $R$, then $S + A$ is a subring, $A$ and $S \cap A$ are ideals of $S + A$ and $S$, respectively, and $(S + A)/A \cong S/(S \cap A)$.

33. Prove the Third Isomorphism Theorem: If $A \subseteq B \subseteq R$, where $A$ and $B$ are ideals of $R$, then $B/A = \{b + A \mid b \in B\}$ is an ideal of $R/A$ and $(R/A)/(B/A) \cong R/B$.

34. Show that every additive subgroup of $R$ is an ideal if and only if $R \cong \mathbb{Z}$ or $R \cong \mathbb{Z}_n$ for some $n > 1$.

35. As in the discussion preceding Example 5 §3.2, define $R(\eta)$ to be the set of all formal sums $a + b\eta$, $a, b \in R$, where $\eta^2 = 0$, $a\eta = \eta a$ for all $a \in R$, and $a + b\eta = c + d\eta$ if and only if $a = c$ and $b = d$.
(a) If $A$ is an ideal of $R$, show that $A(\eta)$ is an ideal of $R(\eta)$ and $\frac{R(\eta)}{A(\eta)} \cong \frac{R}{A}(\eta)$.
(b) Show that $R(\eta) \cong \left\{ \begin{bmatrix} a & b \\ 0 & a \end{bmatrix} \,\middle|\, a, b \in R \right\}$.
(c) If $R$ is a division ring, show that $R(\eta)$ has exactly three ideals, $0$, $R\eta$, and $R(\eta)$. [Hint: $a + b\eta$ is a unit if and only if $a \neq 0$.]

36. As in the discussion preceding Example 5 §3.2, define $R(\gamma)$ to be the set of all formal sums $a + b\gamma$, $a, b \in R$, where $\gamma^2 = 1$, $a\gamma = \gamma a$ for all $a \in R$, and $a + b\gamma = c + d\gamma$ if and only if $a = c$ and $b = d$.
(a) If $A$ is an ideal of $R$, show that $A(\gamma)$ is an ideal of $R(\gamma)$ and $\frac{R(\gamma)}{A(\gamma)} \cong \frac{R}{A}(\gamma)$.
(b) Show that $R(\gamma) \cong \left\{ \begin{bmatrix} a & b \\ b & a \end{bmatrix} \,\middle|\, a, b \in R \right\}$.
(c) If $R$ is a division ring, show that $R(\gamma)$ has exactly four ideals, $0$, $R(1 + \gamma)$, $R(1 - \gamma)$, and $R(\gamma)$.

37. Show that $\mathbb{Z}_m \times \mathbb{Z}_n$ has a subring isomorphic to $\mathbb{Z}_t$, where $t = \operatorname{lcm}(m, n)$.




38. If $R^1$ is as in Theorem 6, show that $R^1 \cong \mathbb{Z} \times R$. [Hint: Corollary to Theorem 7.]

39. Describe the maximal ideals in $R_1 \times R_2 \times \cdots \times R_n$, where $R_i \neq 0$ for each $i$. [Hint: Example 10.]

40. Let $R$ be a ring in which $2 \in R^*$ and $u \in Z(R)$ exists such that $u^2 = -1$. Show that $R(i) \cong R \times R$. [Hint: Let $e = \frac{1}{2}(1 + ui)$ in the Corollary to Theorem 7.]

41. Let $R$ be a ring in which $2 \in R^*$, and $u \in Z(R)$ exists such that $u^2 = \frac{1}{2}$. Show that $R(\sqrt{2}) \cong R \times R$. [Hint: Let $e = \frac{1}{2}(1 + u\sqrt{2})$ in the Corollary to Theorem 7.]

42. Let $\psi: R \to \frac{R}{A} \times \frac{R}{B}$ be the map given by $\psi(r) = (r + A, r + B)$ (see the proof of Theorem 8). If $\psi$ is onto, show that necessarily $R = A + B$. [Hint: Choose $r$ in $R$ such that $\psi(r) = (1 + A, 0 + B)$.]

43. If $X$ is a set and $R$ is a ring, let $S = F(X, R)$ denote the ring of all mappings $X \to R$ using pointwise operations (see Example 4 §3.1).
(a) If $R$ is a field and $x \in X$, show that $\{f \in S \mid f(x) = 0\}$ is a maximal ideal of $S$ for each $x \in X$.
(b) If $M$ is a maximal ideal of $R$, show that $\{f \in S \mid f(x) \in M\}$ is a maximal ideal of $S$.

44. Let $A_1, A_2, \ldots, A_n$ be ideals of $R$ and write $A = \bigcap_{i=1}^{n} A_i$.
(a) Show that $R/A$ is isomorphic to a subring of $R/A_1 \times \cdots \times R/A_n$.
(b) If $A_i + A_j = R$ for all $i \neq j$, show that $R/A \cong R/A_1 \times \cdots \times R/A_n$. [Hint: Show that $R = A_k + \bigcap_{i \neq k} A_i$ for each $k$ by showing that this ideal contains 1. Let $1 = a_k + b_k$, $a_k \in A_k$, $b_k \in \bigcap_{i \neq k} A_i$. Given $(r_1 + A_1, \ldots, r_n + A_n)$ in $\frac{R}{A_1} \times \cdots \times \frac{R}{A_n}$, consider $r = r_1 b_1 + \cdots + r_n b_n$.]
3.5 ORDERED INTEGRAL DOMAINS⁵⁷


The ring $\mathbb{Z}$ of integers is an integral domain that has the additional property of being ordered: For $m$ and $n$ in $\mathbb{Z}$ exactly one of $m < n$, $m = n$, or $n < m$ is true. There are other ordered integral domains (for example $\mathbb{Q}$ or $\mathbb{R}$), but the integers have the further property that they are well ordered: Every nonempty set of positive integers has a smallest member. This assertion is the well-ordering axiom for $\mathbb{Z}$, which is equivalent to the principle of induction. The well-ordering axiom fails to hold for $\mathbb{Q}$ or $\mathbb{R}$, and we devote this brief section to proving that it characterizes $\mathbb{Z}$ among the ordered integral domains.

An integral domain $R$ is said to be ordered if there is a subset $R^+ \subseteq R$, called the set of positive elements of $R$, satisfying the following conditions.

P1 If $a$ and $b$ are in $R^+$, then $a + b$ and $ab$ are in $R^+$.
P2 For all $a \in R$, exactly one of $a \in R^+$, $a = 0$, or $-a \in R^+$ holds.

Write $a < b$ or $b > a$ to mean $b - a \in R^+$. Hence, $\mathbb{Z}$, $\mathbb{Q}$, and $\mathbb{R}$ are ordered integral domains with the usual sets $\mathbb{Z}^+$, $\mathbb{Q}^+$, and $\mathbb{R}^+$ of positive elements. Note that we do not regard 0 as positive in $\mathbb{Z}$, $\mathbb{Q}$, or $\mathbb{R}$, and we retain this convention in any ordered integral domain $R$ ($0 \notin R^+$ by P2).

57. The material covered in this section is not needed elsewhere in the book.


Lemma 1. Let $R \neq 0$ be an ordered integral domain.
(1) $R^+ = \{r \in R \mid r > 0\}$.
(2) If $a \in R$, exactly one of $a < 0$, $a = 0$, or $a > 0$ holds.
(3) If $a < b$ and $b < c$ in $R$, then $a < c$.
(4) If $a < b$ and $c > 0$ in $R$, then $ac < bc$.
(5) $a^2 > 0$ for all $a \neq 0$ in $R$. In particular, $1 > 0$.


Proof. (1) follows from the definition of $<$, and (2) restates P2. If $a < b$ and $b < c$, then $b - a$ and $c - b$ are in $R^+$, so $c - a = (c - b) + (b - a)$ is also in $R^+$ by P1, proving (3). Similarly, (4) follows from P1 because $b - a \in R^+$ and $c \in R^+$ imply that $bc - ac = (b - a)c \in R^+$. As to (5), if $a \neq 0$, then $a > 0$ implies that $a^2 > 0$ by (4), whereas $a < 0$ implies that $-a > 0$, so again $a^2 = (-a)^2 > 0$. Finally, $1 \neq 0$ because $R \neq 0$, so $1 = 1^2 > 0$. ∎


Lemma 1 shows that the complex numbers $\mathbb{C}$ cannot be ordered. For if $\mathbb{C}^+ \subseteq \mathbb{C}$ satisfies P1 and P2, then $-1 = i^2 \in \mathbb{C}^+$ and $1 = 1^2 \in \mathbb{C}^+$ by (5), contradicting P2.

The well-ordering axiom (Section 1.1) is a potent property of the ring $\mathbb{Z}$ of integers, as we have seen. The next theorem shows that it distinguishes $\mathbb{Z}$ among the ordered domains. As for $\mathbb{Z}$, we say that an integral domain is well ordered if it is ordered and every nonempty set $X$ of positive elements has a least member $c$ (that is, $c \in X$ and $c < x$ for all $x$ in $X$, $x \neq c$).


Theorem 1. Let $R \neq 0$ be a well-ordered integral domain. Then an isomorphism $\sigma: \mathbb{Z} \to R$ exists such that, if $k < m$ in $\mathbb{Z}$, then $\sigma(k) < \sigma(m)$ in $R$.


Proof. We begin with two preliminary results.

Claim 1. 1 is the least element of $R^+$.

Proof. Let $c$ be the least element of $R^+$. Then one of $c < 1$, $c = 1$, or $c > 1$ must hold; $c > 1$ is ruled out because $1 \in R^+$ and $c$ is the least element. If $c < 1$, then $0 < c < 1$, so $0 < c^2 < c$ (by Lemma 1). Because $c^2 \in R^+$, this contradicts the minimality of $c$ in $R^+$. Hence, the only possibility is $c = 1$. This proves Claim 1.

Claim 2. $R^+ = \{k1 \mid k \in \mathbb{Z}^+\}$.

Proof. We first show that $k1 \in R^+$ for all $k \in \mathbb{Z}^+$ by induction on $k$. It is true if $k = 1$ because $1 \in R^+$. If $k1 \in R^+$ for some $k \in \mathbb{Z}^+$, then $(k + 1)1 = k1 + 1 \in R^+$ by P1, which proves that $\{k1 \mid k \in \mathbb{Z}^+\} \subseteq R^+$. If this is not equality, let $d$ be the least member of $\{r \in R^+ \mid r \neq k1 \text{ for all } k \in \mathbb{Z}^+\}$. Because $d \in R^+$, either $d = 1$ or $1 < d$ by Claim 1. Since $d \neq 1 = 1 \cdot 1$, we have $1 < d$, which means $d - 1 \in R^+$ and $d - 1 < d$ (because $d - (d - 1) = 1 \in R^+$). Thus, the choice of $d$ implies that $d - 1 = k1$ for some $k \in \mathbb{Z}^+$, and so $d = k1 + 1 = (k + 1)1$, a contradiction. This proves Claim 2.

We can now prove Theorem 1. Define $\sigma: \mathbb{Z} \to R$ by $\sigma(k) = k1$. Then we have $\sigma(k + m) = \sigma(k) + \sigma(m)$ and $\sigma(km) = \sigma(k)\sigma(m)$ for all $k, m \in \mathbb{Z}$ (see Theorem 2 §3.1), and $k < m$ implies $\sigma(k) < \sigma(m)$ because $\sigma(m) - \sigma(k) = (m - k)1 \in R^+$ by Claim 2.

To prove that $\sigma$ is one-to-one, let $\sigma(k) = \sigma(m)$. Then $(k - m)1 = 0 \notin R^+$, so $k \leq m$ by Claim 2. But $(m - k)1 = -(k - m)1 = 0$ too, so $k \geq m$. Hence, $k = m$ and $\sigma$ is one-to-one.

Finally, $\sigma$ is onto. If $r \in R$, there are three cases: $r = 0$, $r > 0$, and $r < 0$. If $r = 0$, then $r = \sigma(0)$; if $r > 0$, then $r = \sigma(k)$ for some $k \in \mathbb{Z}^+$ by Claim 2; if $r < 0$, then $-r > 0$, so $r = \sigma(-k)$ for some $k \in \mathbb{Z}^+$. Hence, $\sigma$ is onto as required. ∎


Exercises 3.5

1. Let $R$ be an ordered integral domain and let $a$, $b$, and $c$ denote elements of $R$. Show that
(a) If $a < b$, then $a + c < b + c$ for all $c \in R$.
(b) If $a < b$ and $c < 0$, then $ac > bc$.
(c) If $a < b$, then $-a > -b$.
(d) If $a < b$ and $c < d$, then $a + c < b + d$.
(e) If $0 < a < b$ and $0 < c < d$, then $ac < bd$.
(f) If $ab < ac$ and $a > 0$, then $b < c$.

2. Write $a \leq b$ in an ordered integral domain $R$ to mean $a < b$ or $a = b$. Show that
(a) $a \leq a$ for all $a \in R$.
(b) If $a \leq b$ and $b \leq a$, then $a = b$.
(c) If $a \leq b$ and $b \leq c$, then $a \leq c$.
Because of this, $\leq$ is called a partial order on $R$.

3. If $R$ is an ordered integral domain, define the absolute value $|a|$ of $a \in R$ by
$$|a| = \begin{cases} a, & \text{if } a \geq 0 \\ -a, & \text{if } a < 0. \end{cases}$$
Prove the following for all $a$ and $b$ in $R$.
(a) $|a| \geq 0$
(b) $-|a| \leq a \leq |a|$
(c) $|ab| = |a||b|$
(d) $|a + b| \leq |a| + |b|$

4. If $R$ is an ordered integral domain and $a \in R$, show that $b \in R$ exists such that $a < b$. Conclude that $R$ has no largest member.

5. In each case, show that the integral domain $R$ cannot be ordered.
(a) $\mathbb{Z}(i)$, the gaussian integers
(b) $\mathbb{Z}_p$, $p$ a prime

6. Suppose that $u > 0$ and $u^2 = 2$ in an ordered integral domain $R$. Prove that $2u < 3$, where $2 = 1 + 1$ and $3 = 2 + 1$.

7. Let $R$ be an ordered integral domain and let $Q$ denote the field of quotients of $R$. Show that $Q$ is ordered if $Q^+ = \{r/u \mid r, u \in R^+\}$.


Chapter 4 


Polynomials 


One cannot escape the feeling that these mathematical formulae have an independent 
existence and an intelligence of their own, that they are wiser than we are, wiser even 
than their discoverers, that we get more out of them than was originally put into them. 


—Heinrich Hertz 


The study of polynomials is the oldest branch of algebra. The Hindus knew how 
to solve quadratics in 600 BC, and the Babylonians by then had developed consid- 
erable skill at algebraic manipulation and were using special cases of the quadratic 
formula. However, symbolic algebra in the form we know it today developed in 
Arabia between 600 and 1000 AD. Arab mathematicians were solving cubic equations and, in the
work of al-Khowarizmi (c.825), were starting to identify geometric magnitudes with 
numbers. These efforts led them to the familiar formulas for areas, volumes, and 
the like. By Descartes’ time (1596-1650), analytic geometry was well understood, 
so that the computational power of algebra and the intuitive power of geometry 
could each enhance the other. 

Subsequently, the theory of equations attracted the best mathematicians. Euler 
and Lagrange considered the problem of finding a general formula, analogous to 
the quadratic formula, for the roots of any quintic polynomial (degree 5). Their 
work led to the epoch-making discovery of Abel who, in 1823, showed that no such
formula exists. Later Galois showed it is impossible for any polynomial of degree 5 
or more, and brought groups into the picture. 

The general study of curves and surfaces as graphs of polynomials is known as 
algebraic geometry. A central problem here is to discover which properties of a curve 
or a surface remain invariant under certain transformations given by polynomials 
in the coordinates. This invariant theory dates from Cayley’s time (1821-1895) and 
continues to be an active research area today. 


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 




4.1 POLYNOMIALS 
The reader is doubtless acquainted with polynomials, having had to graph equations such as $y = x^2 - 2x - 2$, obtain factorizations such as $6x^2 - 11x + 3 = (2x - 3)(3x - 1)$, and find solutions (called roots) of equations such as $x^2 - 2x - 2 = 0$. Moreover, polynomials are associated with geometry. For example, the graph of $y = 3x - 2$ is a line and the graph of $y = 5x^2 - x + 7$ is a parabola. In addition, polynomials are treated as formulas for functions. For example, a function $f: \mathbb{R} \to \mathbb{R}$ could be defined by $f(x) = x^3 - x - 1$. In fact many readers will already know how to differentiate and integrate such polynomial functions.

If $R$ is any ring, a symbol $x$ is called an indeterminate over $R$ if

$$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = 0,\ a_i \in R, \text{ implies } a_i = 0 \text{ for each } i.$$

The study of polynomials requires the existence of such an element for any ring $R$. Clearly, if $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ is to make sense, $x$ and the $a_i$ must belong to some ring, and we begin by constructing such a ring.


Lemma 1. Given a ring $R$, there exists a ring $S$ with the following properties:
(1) $R \subseteq S$ is a subring.
(2) There exists $x \in S$ such that $x$ is an indeterminate over $R$.
(3) If $a \in R$ then $ax = xa$.

Proof. We sketch the proof; the details are in Section 4.6. A function $\alpha: \mathbb{N} \to R$ is called a sequence from $R$. If we write $\alpha(k) = a_k$ for each $k \geq 0$, we denote this sequence by $\alpha = [a_k) = [a_0, a_1, a_2, \ldots)$. Given another sequence $\beta = [b_k)$, we have $[a_k) = [b_k)$ if and only if $\alpha = \beta$, that is if and only if $a_k = b_k$ for all $k \geq 0$.

If $S$ is the set of all sequences from $R$, then $S$ becomes a ring if we define

$$[a_k) + [b_k) = [a_k + b_k),$$
$$[a_k)[b_k) = [p_k), \text{ where } p_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_{k-1} b_1 + a_k b_0 \text{ for } k \geq 0.$$

Moreover, $R$ is a subring of $S$ if we identify $a = [a, 0, 0, 0, \ldots)$ for $a \in R$, proving (1). This being done, define $x = [0, 1, 0, 0, \ldots) \in S$. Then, with some calculation, we obtain $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = [a_0, a_1, a_2, \ldots, a_n, 0, 0, \ldots)$ for all $a_i \in R$, and (2) follows. Finally $ax = [0, a, 0, \ldots) = xa$ for all $a \in R$, proving (3). ∎


Let $x$ be an indeterminate over a ring $R$, and let $S$ be as in Lemma 1. Then

$$R[x] = \{a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \mid n \geq 0,\ a_i \in R \text{ for each } i\}$$

is a subring of $S$, called the ring of polynomials over $R$. Here an expression $f = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ in $R[x]$ is called a polynomial over $R$. The elements $a_i$ in $R$ are called the coefficients of $f$, and they are uniquely determined by $f$ ($x$ is an indeterminate). The polynomial $f$ will often be written simply as

$$f = a_0 + a_1 x + \cdots$$

where it is understood that the sum is finite, that is all coefficients are zero from some point on. Two polynomials

$$f = a_0 + a_1 x + a_2 x^2 + \cdots \quad \text{and} \quad g = b_0 + b_1 x + b_2 x^2 + \cdots$$

in the ring $R[x]$ are equal (and we write $f = g$) if and only if $f - g = 0$, that is $(a_0 - b_0) + (a_1 - b_1)x + \cdots = 0$. Because $x$ is an indeterminate, this means

$$f = g \text{ if and only if } a_k = b_k \text{ for all } k = 0, 1, 2, \ldots.$$

Hence, for example, we cannot write $2x^2 - 3x + 1 = 0$ in $\mathbb{R}[x]$ because it would mean $2 = 0$, $-3 = 0$, and $1 = 0$. Instead we refer to finding a root in $\mathbb{R}$ of the function $f: \mathbb{R} \to \mathbb{R}$ defined by $f(x) = 2x^2 - 3x + 1$, that is an element $a \in \mathbb{R}$ such that $0 = f(a) = 2a^2 - 3a + 1$. In this case $a = 1$ and $a = \frac{1}{2}$ are the only possibilities.

Note that when writing a polynomial we omit terms $0x^k$ where the coefficient is zero. For example we write $2x - 3x^3$ rather than $0 + 2x + 0x^2 - 3x^3$. The coefficient $a_0$ (which may be 0) is called the constant coefficient of $f = a_0 + a_1 x + \cdots$. If all the other coefficients are zero, $f = a_0$ is called a constant polynomial, and the ring $R$ is the subring of all constant polynomials in $R[x]$.

The zero and unity of the ring $R[x]$ are the constant polynomials 0 and 1, respectively. The negative of a polynomial $f = a_0 + a_1 x + a_2 x^2 + \cdots$ is the polynomial $-f = -a_0 - a_1 x - a_2 x^2 - \cdots$ where we negate every coefficient of $f$.

We single out the following facts for reference: 


Lemma 2. Let $f = a_0 + a_1 x + a_2 x^2 + \cdots$ and $g = b_0 + b_1 x + b_2 x^2 + \cdots$ be polynomials in $R[x]$ where $R$ is any ring. Then
(1) $f = g$ if and only if $a_i = b_i$ for each $i$.
(2) $f + g = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 + \cdots$.
(3) $fg = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + \cdots$.
(4) The coefficient of $x^k$ in $fg$ is $a_0 b_k + a_1 b_{k-1} + \cdots + a_{k-1} b_1 + a_k b_0 = \sum_{i+j=k} a_i b_j$.

Example 1. If $f = a_0 + a_1 x + a_2 x^2$ and $g = b_0 + b_1 x$, then
$$fg = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + (a_1 b_1 + a_2 b_0)x^2 + a_2 b_1 x^3.$$

Example 2. In $\mathbb{Z}[x]$, $(1 - 2x + x^3)(2 - x + x^2) = 2 - 5x + 3x^2 - x^4 + x^5$.

Example 3. In $\mathbb{Z}_3[x]$, $(x + 1)^3 = x^3 + 3x^2 + 3x + 1 = x^3 + 1$ because $3 = 0$ in $\mathbb{Z}_3$.
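The product formula in Lemma 2(4) can be checked mechanically. The following is a minimal sketch, assuming a polynomial is stored as a Python list whose entry at index k is the coefficient of x^k; the function name `poly_mul` and the optional `modulus` argument (for coefficients in Z_n) are illustrative choices, not notation from the text.

```python
def poly_mul(f, g, modulus=None):
    """Multiply two coefficient lists (index k holds the coefficient of x^k).

    If modulus is given, coefficients are reduced mod that integer,
    which models multiplication in Z_n[x].
    """
    if not f or not g:
        return []
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            # a_i * b_j contributes to the coefficient of x^(i+j)
            product[i + j] += a * b
    if modulus is not None:
        product = [c % modulus for c in product]
    return product


# Example 2: (1 - 2x + x^3)(2 - x + x^2) in Z[x]
example2 = poly_mul([1, -2, 0, 1], [2, -1, 1])

# Example 3: (x + 1)^3 in Z_3[x]
x_plus_1 = [1, 1]
example3 = poly_mul(poly_mul(x_plus_1, x_plus_1, 3), x_plus_1, 3)

print(example2)  # [2, -5, 3, 0, -1, 1]
print(example3)  # [1, 0, 0, 1]
```

The two printed lists read off as 2 - 5x + 3x^2 - x^4 + x^5 and 1 + x^3, in agreement with Examples 2 and 3.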


The following theorem summarizes the above discussion. 


Theorem 1. Let $R$ be a ring and let $x$ be an indeterminate over $R$. Then:
(1) $R[x]$ is a ring.
(2) $R$ is the subring of all constant polynomials in $R[x]$.
(3) If $Z = Z(R)$ denotes the center of $R$, then the center of $R[x]$ is $Z[x]$.
(4) In particular, $x$ is in the center of $R[x]$.
(5) If $R$ is commutative, then $R[x]$ is commutative.


Proof. We already have (1) and (2), and (3) implies (4) and (5). So we prove (3).
(3). To see that $Z[x] \subseteq Z(R[x])$, let $f = z_0 + z_1 x + z_2 x^2 + \cdots \in Z[x]$, that is each $z_i \in Z$. Then $z_i b = b z_i$ for every $b \in R$, so Lemma 2 implies that $fg = gf$ for each polynomial $g \in R[x]$. It follows that $Z[x] \subseteq Z(R[x])$. Conversely, suppose that $f = a_0 + a_1 x + a_2 x^2 + \cdots$ is central in $R[x]$. Then $fa = af$ for every $a \in R$, so Lemma 2 shows that $a_i a = a a_i$ for each $i$. This means each $a_i \in Z$, so $Z(R[x]) \subseteq Z[x]$. ∎


Lemma 3. Let $(x) = \{xf \mid f \in R[x]\}$ denote the set of all multiples of $x$ in $R[x]$. Then $(x)$ is an ideal of $R[x]$, and $R[x]/(x) \cong R$.

Proof. Define $\theta: R[x] \to R$ by $\theta(a_0 + a_1 x + \cdots) = a_0$. This is well defined by Lemma 2, and satisfies $\theta(1) = 1$. If $f = a_0 + a_1 x + \cdots$ and $g = b_0 + b_1 x + \cdots$ in $R[x]$, the constant coefficients of $f + g$ and $fg$ are $a_0 + b_0$ and $a_0 b_0$, respectively. This means that $\theta$ is an onto ring homomorphism. Since $\ker\theta = (x)$, it follows by the isomorphism theorem that $(x)$ is an ideal of $R[x]$ and $R[x]/(x) \cong R$. ∎


If f is any nonzero polynomial in R[x], the highest exponent of x appearing in 
f (with nonzero coefficient) is called the degree of f, written deg f; the coefficient 
itself is called the leading coefficient of f. If the leading coefficient is 1, f is called 
monic. The degree of the zero polynomial is not defined. Polynomials of degree 
1,2,3,4, and 5 are called, respectively, linear, quadratic, cubic, quartic, and 
quintic polynomials. 


Example 4. The polynomial $x - x^2 + 2x^3$ has degree 3, $x + 2$ is monic of degree 1, and $-5$ has degree 0. The polynomials in $R[x]$ of degree 0 are just the nonzero constant polynomials, that is the nonzero elements of $R$.


Suppose $f \neq 0$ and $g \neq 0$ are polynomials in $R[x]$, say

$$f = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m \quad \text{and} \quad g = b_0 + b_1 x + b_2 x^2 + \cdots + b_n x^n,$$

where $a_m \neq 0$ and $b_n \neq 0$. Thus, $\deg f = m$ and $\deg g = n$, and $a_m$ and $b_n$ are the leading coefficients. Clearly,

$$fg = a_0 b_0 + (a_0 b_1 + a_1 b_0)x + \cdots + (a_m b_n)x^{m+n}.$$

It is possible that $fg = 0$, but if $fg \neq 0$ it follows that $\deg(fg) \leq \deg f + \deg g$. However, if $R$ is a domain, then $a_m b_n \neq 0$ and it follows that

$$fg \neq 0 \quad \text{and} \quad \deg fg = m + n = \deg f + \deg g.$$

This proves (1) and (2) of the following theorem.

Theorem 2. Let $R$ be a domain. Then:
(1) $R[x]$ is a domain.
(2) If $f \neq 0$ and $g \neq 0$ in $R[x]$, then $\deg(fg) = \deg f + \deg g$.
(3) The units in $R[x]$ are the units in $R$.


Proof. (1) and (2) are proved in the above discussion.

(3). If $f$ is a unit in $R[x]$, denote its inverse by $g$. Then $fg = 1 = gf$, so (2) gives $\deg f + \deg g = \deg 1 = 0$. But $\deg f$ and $\deg g$ are nonnegative integers, so this implies that $\deg f = 0 = \deg g$. Hence $f$ and $g$ are (nonzero) elements of $R$, so $f$ is a unit in $R$. Conversely, each unit $u$ in $R$ is a unit in $R[x]$ with inverse the constant polynomial $u^{-1}$. ∎


The next example shows that it is vital that R is a domain in Theorem 2. 


Example 5. Consider $f = 1 + 2x$ in $\mathbb{Z}_4[x]$. Then the fact that $4 = 0$ in $\mathbb{Z}_4$ gives

$$f^2 = (1 + 2x)(1 + 2x) = 1 + (2 + 2)x + 2^2 x^2 = 1.$$

Hence $f$ is a (self-inverse) unit in $\mathbb{Z}_4[x]$ that is not in $\mathbb{Z}_4$, so part (3) of Theorem 2 fails in $\mathbb{Z}_4[x]$. Moreover, (2) also fails because $\deg(f^2) = \deg 1 = 0$ whereas $\deg f + \deg f = 1 + 1 = 2$. Finally, (1) fails because $(2x)^2 = 0$ in $\mathbb{Z}_4[x]$. □
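The failure of Theorem 2 over $\mathbb{Z}_4$ is easy to confirm numerically. This is a small sketch under the assumption that polynomials are coefficient lists (lowest degree first) reduced modulo n; the helper name `poly_mul_mod` is hypothetical.

```python
def poly_mul_mod(f, g, n):
    """Product of coefficient lists f and g in Z_n[x], with trailing zeros removed."""
    product = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] = (product[i + j] + a * b) % n
    # Drop zero leading coefficients, but keep a single entry for a constant
    while len(product) > 1 and product[-1] == 0:
        product.pop()
    return product


f = [1, 2]                                        # 1 + 2x
f_squared = poly_mul_mod(f, f, 4)                 # (1 + 2x)^2 in Z_4[x]
two_x_squared = poly_mul_mod([0, 2], [0, 2], 4)   # (2x)^2 in Z_4[x]

print(f_squared)      # [1]  -> the constant polynomial 1
print(two_x_squared)  # [0]  -> the zero polynomial
```

So $1 + 2x$ is its own inverse and $2x$ is a nonzero zero divisor, matching the three failures listed in Example 5.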


On the other hand, if $F$ is a field it can be shown that $F[x] \times F[x] \cong (F \times F)[x]$, so $R = F \times F$ is a nondomain for which the units in $R[x]$ are in $R$.

The proof of (1) and (2) in Theorem 2 extends to another important case where 
the degree function behaves well (Theorem 3 below). The proof is Exercise 7. 


Theorem 3. Let $R$ be any ring and let $f \neq 0$ and $g \neq 0$ be polynomials in $R[x]$. If the leading coefficient of either $f$ or $g$ is a unit in $R$, then

(1) $fg \neq 0$ in $R[x]$.
(2) $\deg(fg) = \deg f + \deg g$.


The Division Algorithm 


Our discussion of the factorization of integers in Section 1.2, and of the ring $\mathbb{Z}_n$ of integers modulo $n$ in Section 1.3, both depend in a fundamental way on the division algorithm (Theorem 1 §1.2): Given $m$ and $n > 0$ in $\mathbb{Z}$, uniquely defined integers $q$ and $r$ exist such that $m = qn + r$ and $0 \leq r < n$. The standard process of "long division" is an algorithm for computing $q$ and $r$, and an analogous procedure works for polynomials, as shown in Example 6.


Example 6. Given $f = x^2 + 1$ and $g = x^4 + 3x^3 + x + 1$ in $\mathbb{Z}[x]$, find polynomials $q$ and $r$ such that

$$g = qf + r, \text{ and either } r = 0 \text{ or } \deg r < \deg f = 2.$$

Solution. The following tableau describes the process.

$$\begin{array}{rrrrrr}
 & & & x^2 & {}+3x & {}-1 \\ \cline{2-6}
x^2 + 1\ \big) & x^4 & {}+3x^3 & & {}+x & {}+1 \\
 & x^4 & & {}+x^2 & & \\ \cline{2-6}
 & & 3x^3 & {}-x^2 & {}+x & {}+1 \\
 & & 3x^3 & & {}+3x & \\ \cline{3-6}
 & & & -x^2 & {}-2x & {}+1 \\
 & & & -x^2 & & {}-1 \\ \cline{4-6}
 & & & & -2x & {}+2
\end{array}$$

Hence, $q = x^2 + 3x - 1$ and $r = -2x + 2$ in this case. The reader should verify that $g = qf + r$ really is true.

The quotient $q$ appears at the top and is created one term at a time from left to right. At each stage we choose the new term in $q$ so that, when multiplied by the divisor $x^2 + 1$, the result has the same leading coefficient as the last polynomial in the tableau at that stage. The process stops when this operation cannot be achieved, that is, when the last polynomial in the tableau is either 0 or has degree less than the degree of the divisor (in this case, less than 2). This last polynomial is the remainder $r = -2x + 2$ above. □
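The long-division process of Example 6 can be sketched in code. The version below is a minimal illustration that assumes the divisor is monic (as in Example 6) and stores polynomials as integer coefficient lists with the constant term first; the names `trim` and `poly_divmod` are hypothetical helpers, not notation from the text.

```python
def trim(p):
    """Return a copy of p without trailing zero coefficients ([] is the zero polynomial)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_divmod(g, f):
    """Divide g by a monic polynomial f; coefficient lists are lowest degree first.

    Returns (q, r) with g = q*f + r and either r == [] or deg r < deg f.
    """
    f = trim(f)
    if not f or f[-1] != 1:
        raise ValueError("this sketch assumes a monic divisor")
    r = trim(g)
    n = len(f) - 1                          # degree of the divisor
    q = [0] * max(len(r) - n, 0)
    while len(r) - 1 >= n:                  # r is nonzero with deg r >= deg f
        shift = len(r) - 1 - n              # degree of the next quotient term
        lead = r[-1]                        # its coefficient (divisor is monic)
        q[shift] = lead
        for k, c in enumerate(f):
            r[shift + k] -= lead * c        # subtract lead * x^shift * f
        r = trim(r)
    return trim(q), r


# Example 6: g = x^4 + 3x^3 + x + 1 divided by f = x^2 + 1
q, r = poly_divmod([1, 1, 0, 3, 1], [1, 0, 1])
print(q)  # [-1, 3, 1]  -> x^2 + 3x - 1
print(r)  # [2, -2]     -> -2x + 2
```

Each pass through the loop corresponds to one new term of the quotient in the tableau; the loop ends exactly when the running remainder is zero or has degree below that of the divisor.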


This division process requires that the leading coefficient of the divisor is a unit; 
in fact, in most cases of interest the divisor is actually monic (as in Example 6). 
Apart from this requirement, the algorithm works in complete generality and the 
proof (by induction) is an adaptation of the algorithm itself. 


Theorem 4. Division Algorithm. Let $R$ be any ring and let $f$ and $g$ be polynomials in $R[x]$. Assume that $f \neq 0$ and that the leading coefficient of $f$ is a unit in $R$. Then uniquely determined polynomials $q$ and $r$ exist in $R[x]$ such that

(1) $g = qf + r$.
(2) Either $r = 0$ or $\deg r < \deg f$.


Proof. We first prove that such $q$ and $r$ exist. Write $m = \deg g$ and $n = \deg f$. If $g = 0$ or $m < n$, then $g = 0f + g$ does it. So assume that $m \geq n$ and proceed by induction on $m$. Write $f = ux^n + ax^{n-1} + \cdots$ and $g = bx^m + cx^{m-1} + \cdots$, where $u$ is a unit in $R$ by hypothesis. Consider the new polynomial

$$\begin{aligned}
g_1 &= g - bu^{-1}x^{m-n} f \\
&= (bx^m + cx^{m-1} + \cdots) - bu^{-1}x^{m-n}(ux^n + ax^{n-1} + \cdots) \\
&= 0x^m + (c - bu^{-1}a)x^{m-1} + \cdots,
\end{aligned}$$

where we used the fact that $x$ is central in $R[x]$. Hence either $g_1 = 0$ or $\deg g_1 < m$ so, by induction, polynomials $q_1$ and $r$ exist such that $g_1 = q_1 f + r$ and either $r = 0$ or $\deg r < \deg f = n$. But then

$$g = g_1 + bu^{-1}x^{m-n} f = (q_1 f + r) + bu^{-1}x^{m-n} f = (q_1 + bu^{-1}x^{m-n}) f + r.$$

This completes the induction, so $q$ and $r$ exist satisfying (1) and (2).

To prove uniqueness, suppose that also $g = q_1 f + r_1$, where either $r_1 = 0$ or $\deg r_1 < \deg f$. Then $r - r_1 = (q_1 - q)f$. If $q_1 - q \neq 0$ then, since the leading coefficient of $f$ is a unit, Theorem 3 implies that $(q_1 - q)f \neq 0$ and that

$$\deg(r - r_1) = \deg[(q_1 - q)f] = \deg(q_1 - q) + \deg f.$$

But this implies that $\deg(r - r_1) \geq \deg f$, a contradiction. So, $q_1 - q = 0$, whence $r - r_1 = (q_1 - q)f = 0$. This proves the uniqueness. ∎


We use the division algorithm repeatedly. However, even though we proved it for an arbitrary ring $R$, it is most effective when $R$ is commutative. The reason is as follows. Given $a \in R$ and a polynomial $f = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ in $R[x]$, we want to substitute $a$ for $x$ to get $f(a) = a_0 + a_1 a + a_2 a^2 + \cdots + a_n a^n$. But we must be careful when doing this.

To illustrate, suppose that $a$ and $b$ are given in a (possibly noncommutative) ring $R$, and consider the polynomial

$$f = (x - a)(x + b).$$

Then we appear to have

$$f(a) = (a - a)(a + b) = 0(a + b) = 0.$$

However, $f = x^2 + (b - a)x - ab$ when multiplied out, so

$$f(a) = a^2 + (b - a)a - ab = ba - ab.$$

This is clearly nonsense if $ab \neq ba$. The reason is that, in our construction of the ring $R[x]$, we insisted that $rx = xr$ holds for all $r \in R$. So substituting $a$ for $x$ in $f$ is bound to create problems unless $a$ commutes with all the coefficients of $f$. Hence, if substituting $x = a$ is to make sense for all polynomials in $R[x]$, the element $a$ must be in the center of $R$. However, in this case substitution works well.
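A concrete instance of this pitfall can be seen in the ring of 2 × 2 integer matrices. The sketch below is illustrative only: the particular matrices `a` and `b` and the helper names are assumptions chosen for the demonstration, with matrices stored as nested lists.

```python
def mat_mul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(p, q):
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]


def mat_sub(p, q):
    return [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]


zero = [[0, 0], [0, 0]]
a = [[0, 1], [0, 0]]   # two matrices that do not commute
b = [[0, 0], [1, 0]]

# Substituting a into the factored form (x - a)(x + b)
factored = mat_mul(mat_sub(a, a), mat_add(a, b))

# Substituting a into the expanded form x^2 + (b - a)x - ab
expanded = mat_sub(mat_add(mat_mul(a, a), mat_mul(mat_sub(b, a), a)),
                   mat_mul(a, b))

commutator = mat_sub(mat_mul(b, a), mat_mul(a, b))   # ba - ab

print(factored)    # [[0, 0], [0, 0]]
print(expanded)    # [[-1, 0], [0, 1]]
print(commutator)  # [[-1, 0], [0, 1]]
```

The two "values" of f(a) disagree, and the expanded one equals ba − ab, exactly as computed in the text.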


Theorem 5. Evaluation Theorem. Let $R$ be a ring and let $a$ be an element in the center $Z(R)$ of $R$. Define a mapping $\varphi_a: R[x] \to R$ by

$$\varphi_a(a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n) = a_0 + a_1 a + a_2 a^2 + \cdots + a_n a^n.$$

Then $\varphi_a$ is an onto ring homomorphism.


Proof. For a constant polynomial $c$, we have $\varphi_a(c + 0x + 0x^2 + \cdots) = c$, so $\varphi_a$ is onto and $\varphi_a(1) = 1$. To show that $\varphi_a$ preserves addition and multiplication, let

$$f = a_0 + a_1 x + a_2 x^2 + \cdots \quad \text{and} \quad g = b_0 + b_1 x + b_2 x^2 + \cdots$$

be polynomials in $R[x]$. Then $f + g = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 + \cdots$ so

$$\begin{aligned}
\varphi_a(f + g) &= (a_0 + b_0) + (a_1 + b_1)a + (a_2 + b_2)a^2 + \cdots \\
&= (a_0 + a_1 a + a_2 a^2 + \cdots) + (b_0 + b_1 a + b_2 a^2 + \cdots) \\
&= \varphi_a(f) + \varphi_a(g).
\end{aligned}$$

Hence, $\varphi_a$ preserves addition. Turning to multiplication, recall that

$$fg = c_0 + c_1 x + c_2 x^2 + \cdots$$

where the coefficient $c_k$ of $x^k$ is given by $c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0$ for each $k \geq 0$ (see Lemma 2). Because $a$ is central in $R$, we have

$$\begin{aligned}
\varphi_a(f) \cdot \varphi_a(g) &= (a_0 + a_1 a + a_2 a^2 + \cdots)(b_0 + b_1 a + b_2 a^2 + \cdots) \\
&= a_0 b_0 + (a_0 b_1 a + a_1 a b_0) + (a_0 b_2 a^2 + a_1 a b_1 a + a_2 a^2 b_0) + \cdots \\
&= a_0 b_0 + (a_0 b_1 + a_1 b_0)a + (a_0 b_2 + a_1 b_1 + a_2 b_0)a^2 + \cdots \\
&= c_0 + c_1 a + c_2 a^2 + \cdots \\
&= \varphi_a(fg).
\end{aligned}$$

Thus $\varphi_a$ preserves multiplication, and so is a ring homomorphism. ∎


Let $R$ be a ring, let $f = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ be a polynomial in $R[x]$, and let $a \in Z(R)$ be central in $R$. We write $f(a) = a_0 + a_1 a + \cdots + a_n a^n$ for the element of $R$ obtained by substituting $a$ for $x$. Then $f(a)$ is called the evaluation of $f$ at $a$. For example, if $f = 5 + 4x - 2x^2 + x^3 \in \mathbb{Z}[x]$ then $f(3) = 5 + 4 \cdot 3 - 2 \cdot 9 + 27 = 26$.


Example 7. Consider $f = x^3 - x$ in $\mathbb{Z}_6[x]$. Show that $f(a) = 0$ for all $a \in \mathbb{Z}_6$.


Solution. One verifies that $a^3 = a$ for all $a \in \mathbb{Z}_6$. Since $\mathbb{Z}_6$ is commutative, we have $f(a) = a^3 - a = 0$ for all $a \in \mathbb{Z}_6$. □
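The verification asked for in Example 7 is a finite check, so it can be carried out by evaluating $f$ at every element of $\mathbb{Z}_6$. The sketch below assumes coefficient lists with the constant term first and uses Horner's rule; the function name `evaluate` is an illustrative choice.

```python
def evaluate(coeffs, a, n):
    """Evaluate a polynomial (coefficients lowest degree first) at a in Z_n, by Horner's rule."""
    value = 0
    for c in reversed(coeffs):
        value = (value * a + c) % n
    return value


f = [0, -1, 0, 1]                        # x^3 - x
values = [evaluate(f, a, 6) for a in range(6)]
print(values)                            # [0, 0, 0, 0, 0, 0]

# The evaluation f(3) = 26 from the text, reduced modulo a large n to stay in Z
print(evaluate([5, 4, -2, 1], 3, 1000))  # 26
```

A nonzero polynomial can therefore define the zero function on a finite ring, which is one reason polynomials are treated as formal expressions rather than as functions.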


Example 8. If $f = c$ is a constant polynomial, then $f(a) = c$ for all $a \in Z(R)$.


If $a \in Z(R)$, the homomorphism $\varphi_a: R[x] \to R$ in Theorem 5 is called the evaluation map because

$$\varphi_a(f) = f(a), \text{ for all } f \in R[x].$$

The gist of Theorem 5 is that evaluation at $a$ satisfies

$$(fg)(a) = f(a)g(a) \quad \text{and} \quad (f + g)(a) = f(a) + g(a), \text{ for all } f, g \in R[x].$$

Perhaps the most useful consequence of evaluation is


Example 9. If $f = gh$ in $R[x]$, and if $g(a) = 0$ where $a \in Z(R)$, then $f(a) = 0$ too. In particular, if $f = (x - a)h$ where $h \in R[x]$, then $f(a) = 0$.


Solution. Since $a$ is central in $R$, Theorem 5 gives $f(a) = g(a)h(a) = 0h(a) = 0$. If $f = (x - a)h$ then $f(a) = (a - a)h(a) = 0h(a) = 0$. □

In Lemma 3, we showed directly that the mapping $\theta: R[x] \to R$ is a ring homomorphism, where $\theta(f)$ is the constant term of $f$. However, this follows easily from Theorem 5 because $\theta = \varphi_0$ is evaluation at the central element 0. Here is another example.


Example 10. Define $\theta: R[x] \to R$ where $\theta(f)$ is the sum of the coefficients of $f$. Show that $\theta$ is a ring homomorphism.


Solution. If $f = a_0 + a_1 x + a_2 x^2 + \cdots$, then $\theta(f) = a_0 + a_1 + a_2 + \cdots = f(1)$. Thus $\theta = \varphi_1$ is evaluation at 1, and so is a homomorphism because 1 is central. □


Commutative Rings 


Evaluation is most useful when $R$ is commutative, so we make that assumption for the rest of this chapter. In this case, there is a special notation that is commonly used. If $a \in R$ and $R$ is commutative, the set $Ra = \{ra \mid r \in R\}$ of all multiples of $a$ is an ideal of $R$, called the principal ideal generated by $a$, and denoted

$$(a) = Ra = \{ra \mid r \in R\}.$$

We use this notation in the (commutative) ring $R[x]$.


Theorem 6. Let $R$ be a commutative ring, let $a \in R$, and let $f \in R[x]$. Then
(1) Factor Theorem. $f(a) = 0$ if and only if $f = (x - a)q$ for some $q \in R[x]$; that is, if and only if $f \in (x - a)$.
(2) Remainder Theorem. If $f$ is divided by $x - a$, the remainder is $f(a)$.


Proof. Write $f = (x - a)q + r$ by the division algorithm where $q$ and $r$ are in $R[x]$ and either $r = 0$ or $\deg r < \deg(x - a) = 1$. In both cases $r \in R$ is a constant polynomial. Now evaluation at $a$ gives $f(a) = (a - a)q(a) + r = r$. This proves (2), and also shows that $f = (x - a)q + f(a)$. But then $f(a) = 0$ if and only if $f = (x - a)q$ for some $q \in R[x]$, proving (1). ∎


Corollary. If $R$ is commutative, let $\varphi_a: R[x] \to R$ be evaluation at $a \in R$. Then $\ker\varphi_a = (x - a)$ and $R[x]/(x - a) \cong R$.

Proof. We have $\ker\varphi_a = \{f \in R[x] \mid f(a) = 0\} = (x - a)$ by the factor theorem. Now apply the isomorphism theorem. ∎

Example 11. If $f = 2x^3 + x + 1 \in \mathbb{Z}_6[x]$, verify the remainder theorem for $a = 2$.


Solution. The division algorithm gives $f = (x - 2)(2x^2 + 4x + 3) + 1$ (see the tableau), so the remainder is $1 = f(2)$, as required.

$$\begin{array}{rrrrr}
 & & 2x^2 & {}+4x & {}+3 \\ \cline{2-5}
x - 2\ \big) & 2x^3 & & {}+x & {}+1 \\
 & 2x^3 & {}+2x^2 & & \\ \cline{2-5}
 & & 4x^2 & {}+x & \\
 & & 4x^2 & {}+4x & \\ \cline{3-5}
 & & & 3x & {}+1 \\
 & & & 3x & \\ \cline{4-5}
 & & & & 1
\end{array}$$
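Division by a linear factor $x - a$ can be organized as synthetic division, which produces the quotient and the remainder $f(a)$ in one pass. This is a minimal sketch for coefficients in $\mathbb{Z}_n$; note that, as an assumption of this block only, coefficients are listed with the highest degree first, and the name `divide_by_linear` is hypothetical.

```python
def divide_by_linear(coeffs, a, n):
    """Divide a polynomial in Z_n[x] by x - a using synthetic division.

    coeffs lists the coefficients from the highest degree down to the constant.
    Returns (quotient_coeffs, remainder); the remainder equals f(a) in Z_n.
    """
    carry = 0
    row = []
    for c in coeffs:
        carry = (carry * a + c) % n   # same recurrence as Horner's rule
        row.append(carry)
    return row[:-1], row[-1]


# Example 11: f = 2x^3 + x + 1 in Z_6[x], divided by x - 2
quotient, remainder = divide_by_linear([2, 0, 1, 1], 2, 6)
print(quotient)   # [2, 4, 3]  -> 2x^2 + 4x + 3
print(remainder)  # 1          -> f(2)
```

The intermediate values of the recurrence are the coefficients of the quotient, and the final value is the remainder, which is the content of the remainder theorem.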


Let $f \in R[x]$, where $R$ is commutative. Then $a \in R$ is called a root of $f$ if it satisfies the following conditions (equivalent by the factor theorem):

(1) $f(a) = 0$.
(2) $f = (x - a)q$ for some $q \in R[x]$.
(3) $f \in (x - a)$.

Thus, every element of $R$ is a root of the zero polynomial, while a nonzero constant polynomial has no roots.

Suppose $f \neq 0$ has degree $n$. If $a$ is a root of $f$ then $f = (x - a)q_1$ in $R[x]$ by the factor theorem. Moreover, $\deg q_1 = n - 1$ by Theorem 3. If $q_1(a) = 0$, another application of the factor theorem gives $f = (x - a)^2 q_2$ where $\deg q_2 = n - 2$. If $q_2(a) = 0$, the process continues. Because the degrees of the quotients $q_1, q_2, \ldots$ decrease, the process must end with $q_m(a) \neq 0$ for some $m > 0$. This leads to the following terminology: If $f = (x - a)^m q$, where $q \in R[x]$ and $q(a) \neq 0$, the root $a$ is said to have multiplicity $m \geq 1$.


Example 12. In $\mathbb{Z}_8[x]$, find the multiplicity of 2 as a root of the polynomial $f = x^4 + 5x^3 + 3x^2 + 4$.


Solution. We have $f = (x - 2)q_1$ by the division algorithm, where $q_1 = x^3 - x^2 + x + 2$. But $q_1(2) = 0$ too, so $q_1 = (x - 2)q_2$, where $q_2 = x^2 + x + 3$. As $q_2(2) \neq 0$, the multiplicity is 2 and $f = (x - 2)^2(x^2 + x + 3)$. □
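The repeated-division procedure that defines multiplicity can be sketched as a loop: keep dividing by $x - a$ while the remainder is zero. The code below assumes coefficients in $\mathbb{Z}_n$ listed highest degree first and a nonzero input polynomial; the function names are illustrative.

```python
def divide_by_linear(coeffs, a, n):
    """Synthetic division of a polynomial in Z_n[x] (highest degree first) by x - a."""
    carry = 0
    row = []
    for c in coeffs:
        carry = (carry * a + c) % n
        row.append(carry)
    return row[:-1], row[-1]


def multiplicity(coeffs, a, n):
    """Count how many times x - a divides a nonzero polynomial in Z_n[x].

    Returns (m, cofactor) where the polynomial equals (x - a)^m * cofactor
    and a is not a root of the cofactor.
    """
    count = 0
    current = list(coeffs)
    while len(current) > 1:
        quotient, remainder = divide_by_linear(current, a, n)
        if remainder != 0:
            break
        count += 1
        current = quotient
    return count, current


# Example 12: f = x^4 + 5x^3 + 3x^2 + 4 with coefficients taken modulo 8
m, cofactor = multiplicity([1, 5, 3, 0, 4], 2, 8)
print(m)         # 2
print(cofactor)  # [1, 1, 3]  -> x^2 + x + 3
```

The first quotient produced along the way is [1, 7, 1, 2], that is $x^3 - x^2 + x + 2$ modulo 8, which matches $q_1$ in the solution.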


Examples 13 and 14 show how the number of roots of a polynomial depends on 
the ring. 


Example 13. The polynomial $x^2 + 1$ has no roots in $\mathbb{R}$ but two in $\mathbb{C}$, $i$ and $-i$.


Example 14. Consider the polynomial $x^2 - 1$. It has roots 1 and $-1$ in any commutative ring, 1 and $-1$ are the only roots in any integral domain (verify), and it has four roots 1, 3, 5, and 7 in $\mathbb{Z}_8$. If $S = \mathbb{Z}_2[x]$ write

$$R = \left\{ \begin{bmatrix} r & s \\ 0 & r \end{bmatrix} \,\middle|\, r \in \mathbb{Z}_2,\ s \in S \right\} \quad \text{and} \quad \sigma = \begin{bmatrix} 1 & s \\ 0 & 1 \end{bmatrix}.$$

Then one verifies that $R$ is a commutative ring and $\sigma^2 = 1$ for any $s \in S$. Hence, $x^2 - 1$ has infinitely many roots in $R$ (because $S$ is infinite).
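For the finite rings $\mathbb{Z}_n$ the roots of a polynomial can simply be listed by trying every element. The following sketch (with the hypothetical helper `roots_mod`, coefficients lowest degree first) contrasts $\mathbb{Z}_8$, which is not a domain, with the field $\mathbb{Z}_7$.

```python
def roots_mod(coeffs, n):
    """List all roots in Z_n of a polynomial given by coefficients, lowest degree first."""
    def value(a):
        total = 0
        for c in reversed(coeffs):
            total = (total * a + c) % n   # Horner's rule in Z_n
        return total

    return [a for a in range(n) if value(a) == 0]


x_squared_minus_1 = [-1, 0, 1]
print(roots_mod(x_squared_minus_1, 8))  # [1, 3, 5, 7]
print(roots_mod(x_squared_minus_1, 7))  # [1, 6]
```

In $\mathbb{Z}_7$ the only roots are 1 and 6 = −1, while in $\mathbb{Z}_8$ a polynomial of degree 2 has four roots.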


Examples 13 and 14 indicate that not much can be said in general about the 
number of roots of a polynomial over a commutative ring. However, if the ring is 
an integral domain we do have Theorem 8. 


Theorem 8. Let R be an integral domain and let f be a nonzero polynomial in
R[x] of degree n. Then f has at most n roots in R.


Proof. Use induction on n = deg f. If n = 0, then f is a nonzero constant and has
no roots. If n = 1, say f = a_0 + a_1 x where a_1 ≠ 0, let a and b be roots of f. Then
a_0 + a_1 a = 0 = a_0 + a_1 b, so a_1(b − a) = 0. Hence, b = a because R is a domain.
Suppose n > 1. If f has no root in R, we are done. If f(a) = 0, with a ∈ R,
then f = (x − a)q by the factor theorem, and deg q = n − 1 by Theorem 2. Suppose
that b ≠ a is another root of f. Then 0 = f(b) = (b − a)q(b), so q(b) = 0 because
R is a domain. But q has at most n − 1 roots in R by induction, so f has at most
n − 1 roots distinct from a. Hence f has at most n roots, as required. ■




We hasten to note that a polynomial of degree n over an integral domain R need
not have any roots in R (for example x^2 + 1 where R = Z). The force of Theorem
8 is to place a maximum on the number of roots. Also, it is important that R is
commutative: x^2 + 1 has roots ±i, ±j, and ±k in the division ring H of quaternions.
In factoring a polynomial such as f = 6x^2 − 7x + 2 = (2x − 1)(3x − 2) in Z[x],
it is important to be able to find the rational roots of f, that is the roots in Q.
Theorem 9 reduces this task to checking a finite number of potential roots.


Theorem 9. Rational Roots Theorem. Let f = a_0 + a_1 x + a_2 x^2 + ··· + a_n x^n
be a polynomial in Z[x], where a_0 ≠ 0 and a_n ≠ 0. Then every rational root of f
has the form c/d where c | a_0 and d | a_n.


Proof. If c/d ∈ Q is a root of f we may assume that it is in lowest terms; that is,
gcd(c, d) = 1. We have 0 = f(c/d) = a_0 + a_1(c/d) + ··· + a_n(c/d)^n. Multiplying by d^n gives


0 = a_0 d^n + a_1 c d^{n−1} + ··· + a_{n−1} c^{n−1} d + a_n c^n.


Then c | a_0 d^n because c appears in each term but the first. But gcd(c, d^n) = 1 (any
prime dividing c and d^n would divide c and d), so Theorem 5 §1.2 gives c | a_0. A
similar argument shows that d | a_n. ■
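
The finite search promised by Theorem 9 is easy to carry out by machine. The sketch below is only an illustration: the function names `divisors` and `rational_roots` are hypothetical, coefficients are assumed to be given as the integer list [a_0, a_1, ..., a_n] with a_0 ≠ 0 and a_n ≠ 0, and exact arithmetic is done with Python's standard `fractions.Fraction`.

```python
from fractions import Fraction


def divisors(k):
    """Positive divisors of the nonzero integer k."""
    k = abs(k)
    return [d for d in range(1, k + 1) if k % d == 0]


def rational_roots(coeffs):
    """All rational roots of a_0 + a_1 x + ... + a_n x^n (a_0 and a_n nonzero)."""
    # Theorem 9: every rational root is c/d with c | a_0 and d | a_n.
    candidates = {Fraction(sign * c, d)
                  for c in divisors(coeffs[0])
                  for d in divisors(coeffs[-1])
                  for sign in (1, -1)}

    def value(q):
        return sum(a * q**i for i, a in enumerate(coeffs))

    return sorted(q for q in candidates if value(q) == 0)


print(rational_roots([2, -7, 6]))       # 6x^2 - 7x + 2: roots 1/2 and 2/3
print(rational_roots([-4, -1, -1, 3]))  # 3x^3 - x^2 - x - 4: the single root 4/3
```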


Corollary. The only rational roots of a monic polynomial in Z[x] are integers.


Example 15. Let m be a positive integer that is not the square of an integer. Show
that √m is not in Q.


Solution. If √m were in Q, it would be a rational root of x^2 − m. If q = c/d is such
a root (in lowest terms), Theorem 9 shows that c | m and d | 1. Hence d = ±1, so
q = ±c ∈ Z. Neither of these is a root of x^2 − m by hypothesis. □


Example 16. Factor f = 3x^3 − x^2 − x − 4 as far as possible in Q[x].


Solution. If c/d is a rational root of f, then c | (−4) and d | 3 by Theorem 9. Hence
c = ±1, ±2, ±4 and d = ±1, ±3; so c/d = ±1, ±2, ±4, ±1/3, ±2/3, ±4/3. Exhaustive
checking gives 4/3 as a root, so x − 4/3 is a factor in Q[x]. Hence 3x − 4 is a factor in
Q[x] and the division algorithm gives

f = (3x − 4)(x^2 + x + 1) in Q[x].

But x^2 + x + 1 has no rational roots by Theorem 9 (the possibilities are ±1), so
x^2 + x + 1 has no factorization in Q[x] (any factors would be linear and so produce
rational roots). Hence f factors no further in Q[x]. □


Note that we are not done factoring in Example 16 if we allow the factors to have
coefficients in C (because x^2 + x + 1 has roots (1/2)[−1 ± √3 i] in C by the quadratic
formula). Note also that f actually factored in Z[x], even though it has no root in
Z. We examine these observations further in Section 4.2.


Exercises 4.1 
Throughout these exercises R is a ring unless otherwise specified. 


1. In each case compute f +g and fg. 
(a) f=34+22+27+42', g=1+27? +2 in Z;[z] 
(b) f=5+2e+a?+2°,g=24+2+4+2? in Z,[z] 




2. (a) Compute (1 + x)^5 in Z_5[x].


(b) Compute (1+ 2)" in Z,[z}. 
(c) Show that (1 + x)^p = 1 + x^p in Z_p[x], if p is a prime. [Hint: Lemma 1 §3.4.]


3. (a) How many polynomials of degree 3 are there in Z_5[x]?
(b) How many monic polynomials of degree 3 are there in Z_5[x]?
(a) Find all roots of (a — 4)(x — 5) in Ze; in Zr. 
(b) Find all roots of 2° — x in Zg; in Za. 


. (a) Find the number of roots of z? — x in Z4; Zp x Zo; any integral domain; Zg. 


(b) Find a commutative ring in which x^2 − x has infinitely many roots. [Hint:
Exercise 29 §3.1; or consider R = F × F × F × ··· where F is a field.]


6. Assume that f, g, and f + g are all nonzero in R[x].
(a) Show that deg(f + g) ≤ max{deg f, deg g} for any ring R.
(b) Provide an example of where equality fails to hold in (a).

7. (a) Let f and g be nonzero polynomials in R[x] and assume that the leading
coefficient of one of them is a unit. Show that fg ≠ 0 and that deg(fg) = deg f + deg g.
(b) If R is not a domain, show that linear polynomials f and g exist in R[x] such
that deg(fg) < deg f + deg g.


8. Let R be a subring of S, let f ≠ 0 and g be polynomials in R[x], and assume that
the leading coefficient of f is a unit in R. If f divides g in S[x], show that f divides
g in R[x]. [Hint: Division algorithm.]


9. Show that R[x] and R have the same characteristic for any ring R.

10. Where is the fact that a ∈ Z(R) used in the proof of Theorem 5?

11. If a is a nonzero root of f = a_0 + a_1 x + ··· + a_{n−1} x^{n−1} + a_n x^n, show that a^{-1}
(if it exists) is a root of g = a_n + a_{n−1} x + ··· + a_1 x^{n−1} + a_0 x^n. Assume that R is
commutative.


12. If a, b, and c are real and f = x^2 − (a + c)x + (ac − b^2), show that every complex
root of f is real.

13. Divide x^3 − 4x + 5 by 2x + 1 in Q[x]. Why is it impossible in Z[x]?

14. In each case write g = qf + r in R[x], where r = 0 or deg r < deg f.

(a) g=a'44e4++o% +52? +242, f=2?+c+1, R=Ze 

(b) g=2°4+2r4 +a? +044, f=2?+242,R=Z, 

(c) g=23+n7+30+2, f=3c+1, R=Zs 

(d) g=a2° 422? +24+3, f=30+2,R=Z, 

(e) g = 32° + 2a? —~8a+1, f=2?+2,R=Q 

(f) g = 30% +52? +2+6, f=227+1,R=Q 

Which of s —1,2+1, and x — 2 is a factor of a* — 22° — x? + 3a — 2 in Z[a]? 

(a) For which primes p is « — 1 a factor of f = 324 + 523 + 22? +244 in Z[a]? 
(b) For which primes p is x + 2 a factor of f = 524 — 22° + 32? + 4x — 1 in Z,[2]? 
In each case factor f into linear factors in F(z]. 

(a) f=at+12, F=Zi3 (b) f=2° +1, F=Z, 

(c) f=e8—a2?+e2—-1,F=Z, (d) f=e'4+4e? 43245, F=Z, 

18. Let a ≠ 0 in a field F. Determine the integers n ≥ 1 such that x + a is a factor of
x^n + a^n in F[x]. In these cases write down the factorization.

19. If F is a field, let u, v, and w be distinct roots of f = x^3 + ax^2 + bx + c in F[x]. Show
that a = −(u + v + w), b = uv + uw + vw, and c = −uvw.

20. Show that the factor theorem is false in R[x] if R is a noncommutative division ring.
[Hint: Consider f = bx − ba where ab ≠ ba in R.]




21. (a) Show that Z_4[x] has infinitely many units and infinitely many nilpotents.
(b) Find a polynomial in Z_4[x] that is neither a unit nor a nilpotent.

22. If R is a commutative ring, show that the only idempotents in R[x] are in R. [Hint:
If a = 2ae, e^2 = e, show that a = 0.]

23. In each case determine the multiplicity of a as a root of f.

(a) f=a? — In? —4e+3;4=3,R=Ze¢ 

(b) f=a*4+ 20? +4+274+2;a=-1,R=Zz 

(c) f=a°4 204 +03 —2? 4+27-1,a=1,R=—Z4 

(d) f = 4a* —- 825 +2? —-324+9;a=3,R=Q 

24. If R is a commutative ring, a polynomial f in R[x] is said to annihilate R if f(a) = 0
for every a ∈ R.

(a) Show that x^p − x annihilates Z_p. [Hint: Fermat's theorem.]

(b) Show that x° — x annihilates Zo. 

(c) If p ≠ 2 is a prime, show that x^p − x annihilates Z_{2p}. [Hint: Corollary 1, Theorem
8, §3.4.]

(d) If p > 3 is a prime, show that x^p − x annihilates Z_{3p}. [Hint: As in (c).]

(e) Does 2° — a or x” — x annihilate Z35? Justify your answer. 

(f) Show that a polynomial of degree n exists in Z_n[x] that annihilates Z_n.

25. In each case find all rational roots of f and factor f as far as possible in Q[x].

(a) f =4c4 403 — 307+ 42 —3 

(b) f = 4a* + 4a° + 32? -a2—-1 

(c) f=at—ai—2?-2-2 

(d) f=a2i—a2t+25—-a2%*+2-1 

(e) f=act+a> +327 +2742 

(f) f = at — 809 + 82?- 3243 

26. Show that ⁿ√m is not rational unless m = k^n for some integer k.

27. If f is a monic polynomial in Z[x], show that the only rational roots (if any) are
integers.

28. If R is an integral domain and f ∈ R[x] has infinitely many roots in R, show that
f = 0 is the zero polynomial.

29. Let f and g be polynomials in R[x], where R is an integral domain, and assume
that each is either 0 or has degree at most n. If f(a) = g(a) holds for n + 1 distinct
elements a ∈ R, show that f = g.

30. Show that x^p − x = x(x − 1)(x − 2)···(x − p + 1) in Z_p[x], where p is a prime. [Hint:
Exercise 24(a) and Theorem 8.]

31. Show that (x) is a maximal ideal in R[x] if R is a field. What can be said if R is an
integral domain?

32. Let R be any ring and let A denote the set of all polynomials in R[x] whose
coefficients sum to 0. Show that A is an ideal of R[x] and R[x]/A ≅ R as rings.

33. Define θ: R[x] → R by taking θ(f) to be 0 or the leading coefficient of f, depending
on whether f = 0 or f ≠ 0. Is θ a ring homomorphism? Justify your answer.

34. Let R be a commutative ring and let φ_a : R[x] → R be evaluation at a ∈ R.

(a) Show that φ_a(r) = r for all r ∈ R.

(b) If θ: R[x] → R is any ring homomorphism such that θ(r) = r for all r ∈ R,
show that θ = φ_a for some a ∈ R.

(c) Find a nonzero ring homomorphism C[x] → C that is not an evaluation.

(d) Is a ↦ φ_a a ring homomorphism R → F(R[x], R)? Here F(R[x], R) is the ring
of functions R[x] → R with pointwise operations. (See Example 4 §3.1.)



35. If A is an ideal of R, let A[x] = {a_0 + r_1 x + r_2 x^2 + ··· | a_0 ∈ A, r_i ∈ R}. Show that
A[x] is an ideal of R[x] and R[x]/A[x] ≅ R/A.

36. Show that Z_p can be embedded in an infinite field. [Hint: Theorem 2 and Theorem
5, §3.2.]

37. Let r ↦ r̄ denote a ring homomorphism θ: R → S. If f = a_0 + a_1 x + ··· + a_n x^n in
R[x], let f̄ = ā_0 + ā_1 x + ··· + ā_n x^n in S[x]. Prove each of the following statements:

(a) θ̄: R[x] → S[x] with θ̄(f) = f̄ is a ring homomorphism, onto if θ is onto.

(b) If ker θ = A, then ker θ̄ = A[x].

(c) If R ≅ S, then R[x] ≅ S[x].

(d) If A is an ideal of R, then R[x]/A[x] ≅ (R/A)[x].

(e) If f̄ has no root in S, show that f has no root in R.

38. Let R be a commutative ring. Use the notation of Exercise 37.

(a) If P is a prime ideal of R, show that P[x] is a prime ideal of R[x].

(b) If M is a maximal ideal of R, show that M[x] is not a maximal ideal of R[x].

39. Let R be a commutative ring and consider f = a_0 + a_1 x + ··· + a_n x^n in R[x]. If a_0
is a unit in R and a_i is nilpotent for all i ≥ 1, show that f is a unit in R[x]. [Hint:
If u is a unit and a is nilpotent then u + a is a unit.]

40. If R is commutative and f ∈ R[x], denote the corresponding polynomial function
by f̂: R → R. Thus, f̂(a) = f(a) for every a ∈ R. Let F(R, R) denote the set
of all functions R → R using pointwise operations (see Example 4 §3.1). Define
θ: R[x] → F(R, R) by θ(f) = f̂.

(a) Show that θ is a ring homomorphism and hence that the set P(R, R) = θ(R[x])
of all polynomial functions R → R is a subring of F(R, R).

(b) Show that ker θ = {f ∈ R[x] | f(a) = 0 for all a ∈ R}. These polynomials are
said to annihilate R (see Exercise 24).

(c) If R is an infinite integral domain, show that R[x] ≅ P(R, R).

(d) If R is a finite ring, can R[x] ≅ P(R, R)? Give reasons.


41. Lagrange Interpolation Expansion. Let F be a field and let a_0, a_1, a_2,
..., a_n be distinct elements of F, n ≥ 1. Define the Lagrange polynomials

c_k = \frac{\prod_{i \neq k} (x − a_i)}{\prod_{i \neq k} (a_k − a_i)},   k = 0, 1, ..., n,

where the numerator is the product (x − a_0)(x − a_1)···(x − a_n) with (x − a_k) omitted,
and the denominator is similar. If f = 0 or deg f ≤ n in F[x], show that

f = f(a_0)c_0 + f(a_1)c_1 + ··· + f(a_n)c_n.

[Hint: Exercise 29.]


4.2 FACTORIZATION OF POLYNOMIALS OVER A FIELD 


The prime factorization theorem (Theorem 7 §1.2) asserts that every integer n ≥ 2
can be written uniquely as a product of primes. It may come as a surprise that, if F
is a field, an analogous factorization theorem holds in the polynomial ring F[x]. We
devote this section to proving this theorem and to discussing several other results
which arise along the way.


The prime integers can be described as follows: An integer p ≥ 2 is a prime if
p = ab, where a, b ∈ Z, implies that either a = ±1 or b = ±1. In other words, p
admits only trivial factorizations where one of the factors is a unit in Z. If F is
a field, the units of the integral domain F[x] are just the nonzero elements of F
(Theorem 2 §4.1). If a ≠ 0 in F, each polynomial f in F[x] certainly admits the
trivial factorization f = a(a^{-1} f). If we rule out such factorizations, we arrive at the
following analogue for F[x] of the definition of primes in Z.

If F is a field, a polynomial p ≠ 0 in F[x] is called^{58} an irreducible polynomial
(and we say that p is irreducible over F) if:


(1) deg p ≥ 1.
(2) If p = fg in F[x], then either deg f = 0 or deg g = 0.


Polynomials that are not irreducible are called reducible.

Note that deg f = 0 if and only if f is a nonzero constant in F[x], that is f is
a unit in F[x] by Theorem 2 §4.1. Hence, condition (1) ensures that no irreducible
polynomial is a unit (the analogous condition on Z is that p ≥ 2 for all primes p).


Example 1. Show that every linear polynomial is irreducible over any field F.


Solution. Let p = ax + b, a ≠ 0, be linear, and suppose that p = fg in F[x]. As F is
a domain, Theorem 2 §4.1 gives deg f + deg g = deg p = 1. As both deg f ≥ 0 and
deg g ≥ 0 are integers, one of them must equal 0. □


Example 2. If F is a field and p is irreducible in F[x], show that ap is also
irreducible for all a ≠ 0 in F.


Solution. If ap = fg in F[x], then p = (a^{-1} f)g. Because p is irreducible, it follows
that either deg g = 0 or deg(a^{-1} f) = 0. As deg f = deg(a^{-1} f), ap is irreducible. □


If f is any irreducible polynomial in F[x], with leading coefficient a, we have
f = ap, where p = a^{-1} f is irreducible by Example 2 and also monic (leading
coefficient 1). Thus, there is no great loss in generality in working with monic
irreducible polynomials.

Every linear polynomial in F[x] has the form p = ax + b, a ≠ 0, and so has a
root −a^{-1} b in F. However, no irreducible polynomial of degree 2 or more can have
a root in F.


Theorem 1. Let F be a field and consider p in F[x] where deg p ≥ 2.
(1) If p is irreducible, then p has no root in F.
(2) If deg p is 2 or 3 then p is irreducible if and only if it has no root in F.


Proof. (1) If p has a root a ∈ F, then p(a) = 0, so p = (x − a)q by the factor
theorem. Because deg p ≥ 2, this means p is not irreducible, contrary to hypothesis.
Hence p has no root in F.

(2) Assume that p has no root in F. If p = fg then f and g have no root in
F, so deg f ≠ 1 and deg g ≠ 1. But deg f + deg g = deg p is 2 or 3, so deg f = 0 or
deg g = 0. Hence p is irreducible. The converse follows from (1). ■


58. They are not called prime polynomials; that term is reserved for the stronger property: If
p | fg then p | f or p | g. This will be fully investigated in Section 5.1.




Example 3. x^2 + 1 is irreducible in R[x] because it has no root in R.


Example 4. Determine if p = x^3 + 3x^2 + x + 2 is irreducible over Z_5.


Solution. Because Z_5 = {0, 1, 2, 3, 4}, we can compute p(0) = 2, p(1) = 2, p(2) = 4,
p(3) = 4, and p(4) = 3. Hence p has no root in Z_5 and so is irreducible by Theorem 1. □
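
The root test of Theorem 1(2) amounts to evaluating p at every element of Z_p. Here is a minimal Python sketch of that check; the function names are hypothetical and coefficients are assumed to be listed from lowest degree to highest. It deliberately refuses degrees other than 2 or 3, since the test is not valid there.

```python
def values_mod_p(coeffs, p):
    """Values f(0), ..., f(p-1) in Z_p; coeffs are listed low degree first."""
    return [sum(c * pow(a, i, p) for i, c in enumerate(coeffs)) % p
            for a in range(p)]


def irreducible_by_roots(coeffs, p):
    """Theorem 1(2): a polynomial of degree 2 or 3 over Z_p is irreducible
    exactly when it has no root in Z_p."""
    degree = len(coeffs) - 1
    if degree not in (2, 3) or coeffs[-1] % p == 0:
        raise ValueError("the root test needs degree 2 or 3 over Z_p")
    return 0 not in values_mod_p(coeffs, p)


print(values_mod_p([2, 1, 3, 1], 5))          # [2, 2, 4, 4, 3]
print(irreducible_by_roots([2, 1, 3, 1], 5))  # True:  x^3 + 3x^2 + x + 2 over Z_5
print(irreducible_by_roots([1, 0, 1], 5))     # False: x^2 + 1 has the root 2 in Z_5
```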


Example 5. Show that x^2 + x + 1 is the only irreducible quadratic over Z_2.


Solution. Because Z_2 = {0, 1}, every quadratic in Z_2[x] has the form x^2 + ax + b,
where a and b lie in Z_2. Hence there are four possibilities: x^2, x^2 + x, x^2 + 1, and
x^2 + x + 1. The first three have a root in Z_2, whereas x^2 + x + 1 does not. Hence
x^2 + x + 1 is the only irreducible quadratic in Z_2[x]. □


Part (2) of Theorem 1 provides a useful test of irreducibility for polynomials of
degree 2 or 3, but it fails for polynomials of degree 4 or more. For example,


p = x^4 + 3x^2 + 2 = (x^2 + 1)(x^2 + 2)

is clearly not irreducible in R[x], but it has no root in R.


Example 6. Show that p = x^2 − 2 is irreducible over Q but not over R.


Solution. p = (x − √2)(x + √2) in R[x], so evidently it is not irreducible over R. But
this expression shows that the only roots of p in R are √2 and −√2, and neither is
in Q. Hence p is irreducible over Q by Theorem 1. □


Observe that Example 6 shows that the phrase “p is irreducible” is meaningless 
unless we specify which field is to be used for the coefficients. 


Example 7. If p ≡ 3 (mod 4), p a prime, show that x^2 + 1 is irreducible over Z_p.


Solution. If a ≠ 0 in Z_p, Fermat's theorem (Theorem 8 §1.3) gives a^{p−1} = 1. If
p = 4k + 3, k ≥ 0, then 1 = a^{p−1} = a^{4k+2} = (a^2)^{2k+1}. Since (−1)^{2k+1} = −1 ≠ 1
in Z_p, a^2 = −1 is impossible in Z_p, so x^2 + 1 has no root in Z_p. Hence x^2 + 1 is
irreducible over Z_p by Theorem 1. □


Note that the converse of Example 7 is also true (Exercise 17); that is, x^2 + 1 is
irreducible over Z_p (p a prime) if and only if p ≡ 3 (mod 4).
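
The statement can be spot-checked numerically for small primes. The following Python sketch is only a sanity check, not a proof; the helper names are hypothetical. It tests whether x^2 + 1 has a root in Z_p (equivalent to reducibility, since the degree is 2) and lists the primes below 60 for which no root exists.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))


def x2_plus_1_irreducible(p):
    """x^2 + 1 has degree 2, so it is irreducible over Z_p iff it has no root."""
    return all((a * a + 1) % p != 0 for a in range(p))


primes = [p for p in range(2, 60) if is_prime(p)]
print([p for p in primes if x2_plus_1_irreducible(p)])
# [3, 7, 11, 19, 23, 31, 43, 47, 59], exactly the primes congruent to 3 mod 4
```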

If F is a field, the irreducible polynomials are the analogues in F[x] of the primes
in Z. Since there is no way to systematically write down all the integral primes,
any characterization of all the irreducible polynomials in F[x] would seem difficult
(which, in fact, is the case). However, an explicit description does exist in the case
of F = C or F = R. This depends on a deep theorem first proved in 1799 by Gauss.


Theorem 2. Fundamental Theorem of Algebra. If f ∈ C[x] is a nonconstant
polynomial, then f has a root in C.


This result is sometimes expressed by saying that the complex field is algebraically
closed, and we have more to say about that in Chapter 6. No simple proofs of the
theorem are known, and most proofs involve analysis at some stage. We give one
proof in Section 6.6.


Theorem 3. As usual, let C denote the field of complex numbers.
(1) If deg f = n ≥ 1, f ∈ C[x], then f factors completely as


f = u(x − u_1)(x − u_2)···(x − u_n)


where u is the leading coefficient of f and u_1, u_2, ..., u_n are the (not
necessarily distinct) roots of f in C.

(2) The only irreducible polynomials in C[x] are linear.


Proof. (2) is clear by (1). To prove (1) induct on n = deg f. If n = 1, then
f = ux + b = u(x + u^{-1} b), so u_1 = −u^{-1} b. If n > 1, then f has a root u_1 by
Theorem 2, so f = (x − u_1)q where deg q = n − 1. Then q = u(x − u_2)···(x − u_n)
by induction, so f = u(x − u_1)(x − u_2)···(x − u_n) has the desired form. Clearly, u
is the leading coefficient of f, and u_1, u_2, ..., u_n are the roots. ■


The fundamental theorem shows the existence of roots of complex polynomials 
but reveals no way to find them. This is difficult in general. Even so, the theorem 
has many applications, as illustrated by the analysis it provides of real polynomials. 

Let q = ax^2 + bx + c, a ≠ 0, be a real quadratic. If u is a root of q in C, we have
au^2 + bu + c = 0. We solve for u by using the famous quadratic formula:


u = (1/(2a)) [−b ± √(b^2 − 4ac)].


The quantity b^2 − 4ac is called the discriminant of q. If q is irreducible, it has no
real roots so b^2 < 4ac and the two nonreal complex roots are conjugates:

u = (1/(2a)) [−b + i√(4ac − b^2)]  and  ū = (1/(2a)) [−b − i√(4ac − b^2)].

The converse is also true: If u is any nonreal complex number, then u and ū are
the roots of an irreducible real quadratic. In fact, the (monic) quadratic

x^2 − (u + ū)x + uū = (x − u)(x − ū)

has real coefficients uū = |u|^2 and u + ū = 2 re u, and so is irreducible over R
because its roots u and ū are not real.


Theorem 4. Every nonconstant polynomial f in R[x] factors as

f = a(x − r_1)(x − r_2)···(x − r_m) q_1 q_2 ··· q_k


where a is the leading coefficient of f; r_1, r_2, ..., r_m are the real roots of f (if any);
and q_1, q_2, ..., q_k are monic irreducible real quadratics (perhaps none).


Proof. Write f = a_0 + a_1 x + a_2 x^2 + ··· + a_n x^n, where the coefficients a_i are real. If
u is a complex root of f, we claim first that the conjugate ū is also a root. Indeed,
f(u) = 0, so taking complex conjugates gives

0 = ā_0 + ā_1 ū + ā_2 ū^2 + ··· + ā_n ū^n
  = a_0 + a_1 ū + a_2 ū^2 + ··· + a_n ū^n
  = f(ū),

where ā_i = a_i for each i because a_i is real. Thus, the nonreal roots of f (if any)
come in conjugate pairs u and ū. Hence, f factors in C[x] as


f = a(x − r_1)(x − r_2)···(x − r_m)(x − u_1)(x − ū_1)···(x − u_k)(x − ū_k)


by Theorem 3, where a = a_n is the leading coefficient of f; r_1, ..., r_m are the
real roots; and u_1, ū_1, ..., u_k, ū_k are the pairs of nonreal roots. This proves the
theorem because each product

q_j = (x − u_j)(x − ū_j) = x^2 − (u_j + ū_j)x + u_j ū_j

is an irreducible real quadratic (see the discussion preceding this theorem). ■


As an immediate consequence of Theorem 4, we have 


Corollary. The irreducible polynomials in R[x] are either linear or quadratic.


Irreducibles Over the Rationals


Theorems 3 and 4 completely describe the irreducible polynomials in C[x] and R[x].
However, the situation in Q[x] is much more complicated. If f ∈ Q[x] and m is the
least common multiple of the denominators of the coefficients of f, then mf ∈ Z[x].
So it is not surprising that many questions about Q[x] come down to questions
about Z[x]. Theorem 5 is the key to making this transition from Q[x] to Z[x].


Theorem 5. Gauss' Lemma. Let f = gh in Z[x]. If a prime p ∈ Z divides
every coefficient of f, then either p divides every coefficient of g or p divides every
coefficient of h.


Before giving the proof we introduce an important homomorphism. Given a
prime p ∈ Z and an integer a, we let ā denote the residue of a in Z_p. If we are given
f = a_0 + a_1 x + a_2 x^2 + ··· + a_n x^n in Z[x], the polynomial


f̄ = ā_0 + ā_1 x + ā_2 x^2 + ··· + ā_n x^n in Z_p[x]

is called the reduction of f modulo p. The point is that f ↦ f̄ is an onto ring
homomorphism Z[x] → Z_p[x] (see Exercise 37 §4.1).


Proof of Gauss' Lemma. Let f = gh in Z[x] and suppose that p divides every
coefficient of f. Then f̄ = 0 in Z_p[x] so, as reduction modulo p is a homomorphism,
we get 0 = f̄ = ḡ · h̄. But Z_p[x] is an integral domain (Z_p is a field), so this means
ḡ = 0 or h̄ = 0. But ḡ = 0 in Z_p[x] means that every coefficient is zero in Z_p, that
is, every coefficient of g is divisible by p. Gauss' lemma follows. ■
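
To make the reduction map concrete, here is a small Python sketch; the helpers `poly_mul` and `reduce_mod` are hypothetical names, and integer polynomials are assumed to be stored as coefficient lists from lowest degree to highest. It checks on one example that reducing a product modulo p agrees with multiplying the reductions, and that a product with all coefficients divisible by p has a factor with the same property.

```python
def poly_mul(g, h):
    """Product of two integer polynomials (coefficient lists, low degree first)."""
    out = [0] * (len(g) + len(h) - 1)
    for i, b in enumerate(g):
        for j, c in enumerate(h):
            out[i + j] += b * c
    return out


def reduce_mod(f, p):
    """Reduction of f modulo p: replace each coefficient by its residue in Z_p."""
    return [c % p for c in f]


p = 3
g = [3, 6]        # 3 + 6x
h = [1, 2, 1]     # 1 + 2x + x^2
f = poly_mul(g, h)

# Reduction is multiplicative: the reduction of gh equals the product of reductions.
lhs = reduce_mod(f, p)
rhs = reduce_mod(poly_mul(reduce_mod(g, p), reduce_mod(h, p)), p)
print(f, lhs == rhs)        # [3, 12, 15, 6] True

# Here p divides every coefficient of f, and indeed every coefficient of g.
print(reduce_mod(f, p), reduce_mod(g, p), reduce_mod(h, p))
```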


If f is a nonconstant polynomial in Z[x], we say that f = gh in Z[x] is a proper
factorization if both g and h have smaller degree than f. Then Gauss' lemma
yields the following useful theorem.


Theorem 6. Let f be a nonconstant polynomial in Z[x].
(1) If f = gh with g and h in Q[x], then f = g_0 h_0 where g_0 and h_0 are in Z[x],
deg g_0 = deg g, and deg h_0 = deg h.

(2) f is irreducible in Q[x] if and only if it has no proper factorization in Z[x].


Proof. (1) Let a and b be the least common multiples of the denominators of the
coefficients of g and h, respectively. Then g_1 = ag and h_1 = bh are in Z[x], so

abf = g_1 h_1


is an equation in Z[x]. If p is a prime dividing ab, then Gauss' lemma shows that
either p divides all coefficients of g_1 or p divides all coefficients of h_1. Hence, p can
be canceled to give a new equation in Z[x]:


(ab/p) f = g_2 h_2.

Continuing, we delete every prime factor of ab, and finally obtain a factorization

f = g_0 h_0

in Z[x]. Now (1) follows because each of the polynomials g_1, g_2, ... has the same
degree as g and, similarly, deg h_i = deg h for each i.

(2) If f is irreducible in Q[x], it has no proper factorization in Q[x] and hence
none in Z[x]. The converse follows from (1). ■


Incidentally, a polynomial in Z[x] that has no proper factorization in Z[x] is not
called "irreducible in Z[x]". The reason is that, in the general factorization theory
to be developed in Chapter 5, polynomials such as 5x − 5 = 5(x − 1) are not called
irreducible in Z[x] even though they admit no proper factorization in Z[x].

Given f in Q[x], there is an integer a ≠ 0 such that af is in Z[x] (any common
multiple a of the denominators of the nonzero coefficients of f will do). By Example
2, f is irreducible in Q[x] if and only if af is irreducible in Q[x]. Hence, Theorem
6 reduces the problem of testing whether f is irreducible in Q[x] to the problem
of showing that af has no proper factorization in Z[x]. So we look at this latter
problem. The following observation is relevant:


Lemma 1. If f ∈ Z[x] is monic and f = gh in Z[x], we may assume that both g
and h are monic.


Proof. Let the leading coefficients of g and h be a and b, respectively, so the leading
coefficient of gh is ab. Hence, the fact that f = gh is monic means 1 = ab so, since
a, b ∈ Z, either a = b = 1 or a = b = −1. In the second case replace g and h by −g
and −h. Now Lemma 1 follows. ■


Example 8. Show that f = x^5 + 2x^2 + 1 is irreducible in Q[x].


Solution. By Theorem 6 we show that f has no proper factorization in Z[x]. It has
no linear factors because it has no rational roots (the only candidates are ±1 by the
rational roots theorem). Hence, if f factors at all in Z[x], it factors as a quadratic
times a cubic. Moreover, the factors may be taken to be monic by Lemma 1 because
f is monic, say


f = x^5 + 2x^2 + 1 = (x^2 + ax + b)(x^3 + cx^2 + dx + e),


where a, b, c, d, and e are integers. Multiplying the right side out and equating
coefficients of powers of x gives five equations:


a + c = 0, d + ac + b = 0, e + ad + bc = 2, ae + bd = 0, and be = 1.


The last equation gives b = e = ±1; then the second-to-last equation gives a + d = 0.
Because a + c = 0 too, the second equation becomes −a − a^2 + b = 0. Hence a is an
integral root of x^2 + x − b = 0, where b = ±1. But the only rational candidates are
±1 by the rational roots theorem, and neither is a root when b is 1 or −1. Hence
no such factorization of f exists, so f is irreducible in Q[x]. □


The heavy-handed method in Example 8 is less effective for polynomials f of 
higher degree, because the resulting systems of equations are complicated. 


Therefore, we give two other irreducibility tests for Q[x]. The first utilizes, for
a prime p, the homomorphism Z[x] → Z_p[x] used in the proof of Gauss' lemma,
where g ↦ ḡ and ḡ comes from reducing coefficients of g modulo p.


Theorem 7. Modular Irreducibility Test. Let 0 ≠ f ∈ Z[x] and suppose that
a prime p exists such that
(1) p does not divide the leading coefficient of f (for example, if f is monic).
(2) The reduction f̄ of f modulo p is irreducible in Z_p[x].
Then f is irreducible in Q[x].

Proof. First, deg f̄ = deg f by condition (1). Suppose f is not irreducible in
Q[x], so there is a proper factorization f = gh in Z[x] by Theorem 6. Then, we
have deg ḡ ≤ deg g < deg f = deg f̄ and, similarly, deg h̄ < deg f̄. But f̄ = ḡh̄,
contradicting the irreducibility of f̄ in Z_p[x]. ■


Example 9. Show that f = x^3 + 4x^2 + 6x + 2 is irreducible in Q[x].


Solution. We could use the rational roots theorem to show that f has no root in Q.
However, reducing modulo 3 is much easier. Then f̄ = x^3 + x^2 + 2, which clearly
has no root in Z_3. Hence Theorem 7 applies. □


Example 10. Show that f = x^4 + 2x^3 + 2x^2 − x + 1 is irreducible in Q[x].


Solution. Reduction modulo 2 gives f̄ = x^4 + x + 1 in Z_2[x]. This polynomial has
no root in Z_2 so, if it fails to be irreducible, it must factor into two quadratics.
These must be irreducible (they have no root) so, by Example 5, both must equal
x^2 + x + 1. But (x^2 + x + 1)^2 = x^4 + x^2 + 1 ≠ f̄. Hence f̄ is irreducible in Z_2[x], so
f is irreducible in Q[x] by Theorem 7. □
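
For small primes and degrees, irreducibility of the reduction in Z_p[x] can be decided by brute force: a reducible polynomial of degree n over a field has a monic factor of degree between 1 and n/2. The Python sketch below is one possible check, not an efficient algorithm. The function names are hypothetical, coefficients are assumed to be listed from lowest degree to highest, and the leading coefficient of f is assumed not to be divisible by p.

```python
from itertools import product


def rem_monic(f, d, p):
    """Remainder of f on division by the monic polynomial d over Z_p."""
    f = [c % p for c in f]
    while len(f) >= len(d):
        lead = f[-1]
        shift = len(f) - len(d)
        for i, c in enumerate(d):
            f[shift + i] = (f[shift + i] - lead * c) % p
        f.pop()  # the leading coefficient is now zero
    return f


def is_irreducible_mod_p(f, p):
    """Try every monic divisor of degree 1, ..., n // 2 over Z_p."""
    n = len(f) - 1
    for deg in range(1, n // 2 + 1):
        for tail in product(range(p), repeat=deg):
            d = list(tail) + [1]          # monic polynomial of degree deg
            if not any(rem_monic(f, d, p)):
                return False              # d divides f in Z_p[x]
    return True


print(is_irreducible_mod_p([2, 6, 4, 1], 3))      # Example 9 modulo 3:  True
print(is_irreducible_mod_p([1, -1, 2, 2, 1], 2))  # Example 10 modulo 2: True
print([is_irreducible_mod_p([1, 0, 0, 0, 1], p) for p in (2, 3, 5, 7)])
# x^4 + 1 is reducible modulo each of these primes: [False, False, False, False]
```

A result of True for some prime p not dividing the leading coefficient lets Theorem 7 be applied; a result of False says nothing about irreducibility in Q[x].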


Note that the converse of the modular irreducibility test fails. In fact x^4 + 1 is
irreducible in Q[x] but not in Z_p[x] for any prime p (see Example 4 §6.4).

The next test for Q-irreducibility is due to F.G. Eisenstein, a pupil of Gauss.


Theorem 8. Eisenstein Criterion. Consider f = a_0 + a_1 x + ··· + a_n x^n in Z[x],
where n ≥ 1 and a_n ≠ 0. Suppose that a prime p ∈ Z exists such that

(1) p divides each of a_0, a_1, ..., a_{n−1}.

(2) p does not divide a_n.

(3) p^2 does not divide a_0.

Then f is irreducible in Q[x].

Proof. If it is not irreducible, let f = gh be a proper factorization in Z[x] (by
Theorem 6). Write


g = b_0 + b_1 x + ··· + b_m x^m  and  h = c_0 + c_1 x + ··· + c_ℓ x^ℓ.

Because p divides a_0 = b_0 c_0 and p^2 does not divide a_0, it follows that p divides
exactly one of b_0 or c_0, say p divides b_0 but not c_0. Then p does not divide b_m (by (2),
as a_n = b_m c_ℓ), so let b_k be the first integer in the list b_0, b_1, ..., b_m not divisible
by p. Equating coefficients of x^k in f = gh gives


a_k = b_k c_0 + b_{k−1} c_1 + ··· + b_0 c_k.


Now p divides a_k (by (1), because k ≤ m < n), and p divides every term in the sum
after the first (by the choice of b_k). Hence p divides b_k c_0 too, so it divides one of
b_k and c_0. This contradiction proves the theorem. ■


Example 11. Show that 2x^5 + 27x^3 − 18x + 12 is irreducible in Q[x].

Solution. The Eisenstein criterion applies with p = 3. □


Example 12. Q[x] contains an irreducible polynomial of every positive degree. In
fact x^n − 2 is irreducible in Q[x] for any n ≥ 1 by the Eisenstein criterion. □
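
The three conditions of the Eisenstein criterion are simple divisibility checks. A minimal Python sketch follows; the function name `eisenstein_applies` is hypothetical and the coefficients are assumed to be given as the integer list [a_0, a_1, ..., a_n]. A return value of False only means the criterion does not apply for that prime; it does not show that the polynomial is reducible.

```python
def eisenstein_applies(coeffs, p):
    """Check conditions (1)-(3) of the Eisenstein criterion for the prime p.

    coeffs = [a_0, a_1, ..., a_n] with n >= 1 and a_n != 0.
    """
    *lower, lead = coeffs
    return (len(coeffs) >= 2
            and all(a % p == 0 for a in lower)   # (1) p divides a_0, ..., a_{n-1}
            and lead % p != 0                    # (2) p does not divide a_n
            and coeffs[0] % (p * p) != 0)        # (3) p^2 does not divide a_0


# Example 11: 2x^5 + 27x^3 - 18x + 12 with p = 3
print(eisenstein_applies([12, -18, 0, 27, 0, 2], 3))   # True

# Example 12: x^n - 2 with p = 2, for n = 1, ..., 6
print(all(eisenstein_applies([-2] + [0] * (n - 1) + [1], 2) for n in range(1, 7)))  # True
```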


Example 13. If p is a prime, show that the pth cyclotomic polynomial


Φ_p = x^{p−1} + x^{p−2} + ··· + x + 1

is irreducible in Q[x].


Solution. Replacing x by x + 1, it suffices to show that Φ_p(x + 1) is irreducible.
[Indeed, if Φ_p = gh is a proper factorization, then the same is true of the factorization
Φ_p(x + 1) = g(x + 1)h(x + 1).] Now observe that


(x − 1)Φ_p = (x − 1)(x^{p−1} + x^{p−2} + ··· + x + 1) = x^p − 1.

Replacing x by x + 1 gives xΦ_p(x + 1) = (x + 1)^p − 1 so, by the binomial theorem,

Φ_p(x + 1) = x^{p−1} + \binom{p}{1} x^{p−2} + ··· + \binom{p}{p−2} x + p.


But p divides \binom{p}{k} for 1 ≤ k ≤ p − 2 by Lemma 1 §3.4. Hence, the Eisenstein criterion
applies, showing that Φ_p(x + 1) is irreducible. □
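
The coefficients of Φ_p(x + 1) can be listed directly from the binomial theorem: the coefficient of x^k in ((x + 1)^p − 1)/x is the binomial coefficient C(p, k + 1). The short Python sketch below uses the standard `math.comb`; the helper name `shifted_cyclotomic` is hypothetical. It prints the coefficients for p = 5 and checks the Eisenstein conditions for a few primes.

```python
from math import comb


def shifted_cyclotomic(p):
    """Coefficients [a_0, ..., a_{p-1}] of Phi_p(x + 1) = ((x + 1)^p - 1) / x."""
    return [comb(p, k + 1) for k in range(p)]


print(shifted_cyclotomic(5))  # [5, 10, 10, 5, 1]

for p in (2, 3, 5, 7, 11, 13):
    *lower, lead = shifted_cyclotomic(p)
    # Eisenstein conditions: p | a_0, ..., a_{p-2}; p does not divide a_{p-1} = 1;
    # and p^2 does not divide a_0 = p.
    ok = all(a % p == 0 for a in lower) and lead % p != 0 and lower[0] % (p * p) != 0
    print(p, ok)  # ok is True for each of these primes
```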


Unique Factorization 


Theorems 3 and 4 show that any polynomial in C[x] or R[x] is a constant times a
product of (monic) irreducible factors. We conclude this section with a proof that
this is true in F[x] for any field F, and that the resulting factorization is unique.

One comment on uniqueness is in order. The prime factorization theorem for
Z asserts that every integer n ≥ 2 is uniquely a product of primes. However, the
uniqueness requires the assumption that primes are positive. Hence every integer
apart from 0, 1, and −1 factors uniquely as a unit ±1 times a product of primes.
The exceptions are 0 and the units ±1 of Z. If F is a field, the units in F[x] are the
nonzero constant polynomials, so the analogue for F[x] of the prime factorization
theorem would be: Every nonconstant polynomial in F[x] factors uniquely as a unit
u ≠ 0 in F times a product of irreducible polynomials. But because of the trivial
factorization f = a(a^{-1} f), uniqueness here requires that the irreducible polynomials
be monic (this is analogous to insisting that primes in Z are positive). The reason
this works is Theorem 9.

Recall that, if R is a commutative ring, we say that a polynomial d ∈ R[x] is a
divisor of f ∈ R[x], or that d divides f, if f = qd for some q ∈ R[x].


Theorem 9. Let F be a field and let f and g be nonzero monic polynomials in
F[x], each of which divides the other. Then f = g.


Proof. If f = qg and g = pf in F[x] then eliminating g gives f = qpf. Hence 1 = qp
(F[x] is a domain) so q is a constant in F. Since f = qg and f and g are monic,
comparing leading coefficients gives q = 1. Hence f = g. ■




The proof of the following result is left to the reader (Exercise 40). 


Corollary. If F is a field and p ∈ F[x] is monic, the following are equivalent:
(1) p is irreducible.
(2) If d is a monic divisor of p then either d = 1 or d = p.


With these results, the factorization theory for F[x] closely parallels that for Z.
Therefore, we skip many details and merely sketch the proofs of the results. The 
first item on the agenda is the notion of greatest common divisor. 


Theorem 10. Let f and g be nonzero polynomials in F[x], where F is a field. 
Then a uniquely determined polynomial d exists in F[x| satisfying the following 
conditions. 

(1) d is monic. 

(2) d divides both f and g. 

(3) If h divides both f and g, then h divides d. 

(4) d=uf +g for some polynomials u and v in Fz]. 
Finally, d is the unique polynomial satisfying (1), (2), and (3). 


Proof. Consider $X = \{uf + vg \mid u, v \in F[x]\}$. This set contains nonzero polynomials
(for example $f^2$) and thus contains monic polynomials. Among all the monic
polynomials in $X$, let $d = uf + vg$ be one of smallest degree. Then (1) and (4) are
satisfied, and (3) is an easy consequence of (4). By the division algorithm write
$f = qd + r$, where $r = 0$ or $\deg r < \deg d$. Then

$$r = f - qd = f - q(uf + vg) = (1 - qu)f - (qv)g.$$

If $r \neq 0$ and $a$ is the leading coefficient of $r$, this expression shows that $a^{-1}r$ is a
monic member of $X$ of smaller degree than $d$. This contradicts the choice of
$d$, so $r = 0$ and $d$ divides $f$. Similarly, $d$ divides $g$, proving (2).

Finally, if $d_1$ is another polynomial satisfying (1), (2), and (3), then $d$ and $d_1$
each divides the other. Since both are monic, $d = d_1$ by Theorem 9, proving
uniqueness. ∎


As in $\mathbb{Z}$, the polynomial $d$ in Theorem 10 is called the greatest common divisor
of $f$ and $g$ in $F[x]$, denoted $\gcd(f, g)$, and $f$ and $g$ are called relatively prime in
$F[x]$ if $d = 1$. Note that 1 is the unique monic polynomial of degree 0, and that
Theorem 10 allows the possibility that $d = 1$.


Example 14. Find the greatest common divisor $d$ of $x^2 - 1$ and $2x + 1$ in $\mathbb{Q}[x]$.

Solution. Because $d$ divides the irreducible polynomial $2x + 1$, either $d = 1$ or
$d = x + \tfrac{1}{2}$. But $x + \tfrac{1}{2}$ does not divide $x^2 - 1$, so $d = 1$. Moreover, the division
algorithm gives

$$x^2 - 1 = \left[\tfrac{1}{4}(2x - 1)\right](2x + 1) - \tfrac{3}{4}.$$

This implies that $1 = \tfrac{1}{3}(2x - 1)(2x + 1) - \tfrac{4}{3}(x^2 - 1)$, and so expresses $d$ as a linear
combination of $x^2 - 1$ and $2x + 1$. □


In general, if d = gcd(f, g) the analogue of the euclidean algorithm (Section 1.2) 
expresses d as a linear combination of f and g. Example 15 provides an illustration. 


Example 15. Find the gcd of $f = x^4 - x^2 + x - 1$ and $g = x^3 - x^2 + x - 1$ in $\mathbb{Q}[x]$
and express it as a linear combination of these polynomials.

Solution. We use the division algorithm repeatedly to obtain

$$\begin{aligned}
f &= (x + 1)g + (-x^2 + x) \\
g &= (-x)(-x^2 + x) + (x - 1) \\
-x^2 + x &= (-x)(x - 1) + 0.
\end{aligned}$$

As in $\mathbb{Z}$, the last nonzero remainder $d = x - 1$ is the gcd. (In this case it happens to
be monic; in general, if the leading coefficient is $a$ then $d$ is obtained by multiplying
by $a^{-1}$.) Eliminating remainders gives the required linear combination:

$$\begin{aligned}
x - 1 &= g + x(-x^2 + x) \\
      &= g + x[f - (x + 1)g] \\
      &= xf - (x^2 + x - 1)g.
\end{aligned}$$ □
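
The computations in Examples 14 and 15 can be mechanized. The following is a minimal sketch, not part of the text proper: it assumes polynomials in $\mathbb{Q}[x]$ are stored as coefficient lists with the constant term first, and the helper names (`trim`, `add`, `mul`, `divmod_poly`, `ext_gcd`) are our own illustrative choices. It runs the euclidean algorithm while tracking the multipliers $u$ and $v$ of Theorem 10(4).

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(f, g):
    """Division algorithm: (q, r) with f = q*g + r and r = 0 or deg r < deg g."""
    f, g = trim(f), trim(g)
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = Fraction(r[-1]) / g[-1]
        q[shift] = c
        # subtract c * x^shift * g, which cancels the leading term of r
        r = add(r, mul([Fraction(0)] * shift + [-c], g))
    return trim(q), r

def ext_gcd(f, g):
    """Return (d, u, v) with d = gcd(f, g) monic and d = u*f + v*g."""
    r0, r1 = trim(f), trim(g)
    u0, u1 = [Fraction(1)], []
    v0, v1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        neg_q = [-c for c in q]
        r0, r1 = r1, r
        u0, u1 = u1, add(u0, mul(neg_q, u1))
        v0, v1 = v1, add(v0, mul(neg_q, v1))
    a = Fraction(1) / r0[-1]          # rescale so that d is monic
    return mul([a], r0), mul([a], u0), mul([a], v0)

# Example 15: f = x^4 - x^2 + x - 1, g = x^3 - x^2 + x - 1
d, u, v = ext_gcd([-1, 1, -1, 0, 1], [-1, 1, -1, 1])
# d is x - 1, u is x, v is -(x^2 + x - 1)
```

Exact rational arithmetic (`Fraction`) is used so that leading terms cancel exactly; floating point would make the test "remainder is zero" unreliable.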


Theorem 11 is the analogue for polynomials of Euclid’s lemma for integers. 


Theorem 11. Let $p \in F[x]$ be irreducible, $F$ a field. If $p$ divides a product
$f_1 f_2 \cdots f_n$ of nonzero polynomials in $F[x]$, then $p$ divides one of the $f_i$.


Proof. By induction on $n$, it suffices to do the case $n = 2$. If $p$ divides $fg$, let
$d = \gcd(f, p)$. Then $d$ divides $p$ so, as $p$ is irreducible, either $\deg d = 0$ (so $d = 1$) or
$\deg d = \deg p$. In the second case, $p = ad$, $a \in F$, so $p$ divides $f$ (because $d$ divides
$f$), and we are finished. If $d = 1$, Theorem 10 gives $1 = up + vf$, where $u, v \in F[x]$.
Hence $g = ugp + vfg$, so $p$ divides $g$ (because $p$ divides $fg$). Hence, the theorem
holds in this case too. ∎


Theorem 12. Unique Factorization Theorem. If $F$ is a field, let $f$ be a
nonconstant polynomial in $F[x]$. Then

(1) $f = a p_1 p_2 \cdots p_m$, where $a \in F$ and $p_i$ is monic and irreducible for all $i$.
(2) The factorization in (1) is unique except for the order of the factors.


Proof. (1) It suffices to write $f$ as a product $f = q_1 q_2 \cdots q_m$, where each $q_i$ is
irreducible. (If $a_i$ is the leading coefficient of $q_i$ for each $i$, take $p_i = a_i^{-1} q_i$ and
$a = a_1 a_2 \cdots a_m$.) Proceed by strong induction on $n = \deg f$. If $n = 1$, then $f$ itself
is irreducible. If $n > 1$, then either $f$ is irreducible (and we're done) or $f = gh$,
where $0 < \deg g < n$ and $0 < \deg h < n$. In this case, both $g$ and $h$ are products of
irreducible polynomials by induction.

(2) If it is not unique, let $f$ be a nonconstant polynomial of minimal degree that
admits two such factorizations:

$$f = a p_1 p_2 \cdots p_m = b q_1 q_2 \cdots q_k.$$

Then $a = b$ because each is the leading coefficient of $f$. Now Theorem 11 asserts
that $p_1$ divides one of the $q_j$, say $p_1$ divides $q_1$. Because $\deg p_1 \neq 0$, this implies
that $\deg p_1 = \deg q_1$ and hence that $q_1 = c p_1$, $c \in F$. But $q_1$ and $p_1$ are monic, so
$c = 1$ and $p_1 = q_1$. Canceling gives another polynomial $p_2 \cdots p_m = q_2 \cdots q_k$ of lower
degree than $f$ that has two such factorizations. This result contradicts the choice
of $f$. ∎
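
For a small prime $p$, the factorization promised by Theorem 12 can be found in $\mathbb{Z}_p[x]$ by exhaustive search. The sketch below is only an illustration under our own conventions (coefficient lists with the constant term first; the function names are hypothetical, and $p$ is assumed prime). It tries monic divisors of increasing degree; a divisor found this way is irreducible because every divisor of smaller degree has already been removed. The call `pow(a, -1, p)` needs Python 3.8 or later.

```python
from itertools import product

def trim(a):
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def divmod_poly(f, g, p):
    """Division algorithm in Z_p[x]; g must be monic."""
    r = trim([c % p for c in f])
    q = [0] * max(len(r) - len(g) + 1, 1)
    while len(r) >= len(g):
        shift, c = len(r) - len(g), r[-1]
        q[shift] = c
        for i, gi in enumerate(g):
            r[shift + i] = (r[shift + i] - c * gi) % p
        r = trim(r)
    return trim(q), r

def factor(f, p):
    """Return (a, [p_1, ..., p_m]) with f = a*p_1*...*p_m, each p_i monic irreducible."""
    f = trim([c % p for c in f])
    a = f[-1]
    inv = pow(a, -1, p)                     # inverse of the leading coefficient
    f = [(inv * c) % p for c in f]          # now f is monic
    factors, d = [], 1
    # a proper factor of least degree has degree at most (deg f) / 2
    while len(f) - 1 >= 2 * d:
        found = False
        for tail in product(range(p), repeat=d):
            g = list(tail) + [1]            # a monic polynomial of degree d
            q, r = divmod_poly(f, g, p)
            if not r:
                factors.append(g)
                f, found = q, True
                break
        if not found:
            d += 1
    if len(f) > 1:
        factors.append(f)                   # the remaining cofactor is irreducible
    return a, factors

print(factor([-1, 0, 0, 1], 5))   # x^3 - 1 over Z_5
print(factor([-1, 0, 0, 1], 7))   # x^3 - 1 over Z_7
```

The search is exponential in the degree, so this is suitable only for checking hand computations such as the exercises in this section.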




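Example 16. Factor $f = x^3 - 1$ into irreducibles in $\mathbb{C}[x]$, $\mathbb{R}[x]$, $\mathbb{Q}[x]$, $\mathbb{Z}_5[x]$, and
$\mathbb{Z}_7[x]$.


Solution. We have $f = (x - 1)(x^2 + x + 1)$ over any field. Now $p = x^2 + x + 1$ has
no root in $\mathbb{R}$, $\mathbb{Q}$, or $\mathbb{Z}_5$, so the factorization is $f = (x - 1)p$ in these cases. However,
$p = (x - u)(x - \bar{u})$ in $\mathbb{C}[x]$, where $u = \tfrac{1}{2}(-1 + i\sqrt{3})$, and $p = (x - 2)(x - 4)$ in
$\mathbb{Z}_7[x]$. Thus $f$ factors completely into linear factors over $\mathbb{C}$ and $\mathbb{Z}_7$. □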
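
The claims about roots in $\mathbb{Z}_5$ and $\mathbb{Z}_7$ in Example 16 are easy to confirm by trying every element of the field. A small sketch follows; the function name `roots_mod_p` and the coefficient-list convention (constant term first) are assumptions made for this illustration only.

```python
def roots_mod_p(coeffs, p):
    """Return all a in Z_p with f(a) = 0; coeffs are listed lowest degree first."""
    def value(a):
        acc = 0
        for c in reversed(coeffs):      # Horner's rule
            acc = (acc * a + c) % p
        return acc
    return [a for a in range(p) if value(a) == 0]

quad = [1, 1, 1]                        # x^2 + x + 1
print(roots_mod_p(quad, 5))             # [] : no root in Z_5
print(roots_mod_p(quad, 7))             # [2, 4] : roots 2 and 4 in Z_7
print(roots_mod_p([-1, 0, 0, 1], 7))    # [1, 2, 4] : x^3 - 1 splits over Z_7
```

Note that the absence of roots settles irreducibility only for polynomials of degree 2 or 3; in higher degree a polynomial without roots may still factor.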


Carl Friedrich Gauss (1777-1855) There is little doubt that Gauss ranks with
Archimedes and Newton as one of the greatest mathematicians of all time. He
began as a child prodigy and became possibly the last mathematician to know everything
in his subject. By the time he was 20 he had, among other things, shown that a regular polygon
of 17 sides was constructible with compass and straightedge (a problem unsolved since
the time of the ancient Greeks), discovered the method of least squares (10 years before
Legendre), proved that every positive integer is the sum of three triangular numbers
(of the form $\tfrac{1}{2}n(n+1)$), and proved the law of quadratic reciprocity (a feat that had
eluded Euler). At the age of 22 he completed his Ph.D. dissertation under Pfaff at the
University of Helmstedt by giving the first rigorous proof of the fundamental theorem
of algebra. In 1801, he published a timeless masterpiece, Disquisitiones Arithmeticae,
on number theory in which he introduced the idea of congruence and which made him
famous at the age of 24.


Gauss was also gifted in areas other than mathematics. He was very good at languages, 
and before he was 19 he had seriously considered philology as a profession. (At age 62 
he started learning Russian, and in two years was completely literate.) He also had other 
scientific interests. His discovery of the method of least squares led him to the bell- 
shaped normal curve in statistics, now called the gaussian distribution. His interests in 
physics were both theoretical and experimental. He did fundamental work in the theory 
of electromagnetism (the unit of magnetic intensity is called the Gauss) and, among 
other things, he invented the electric telegraph with Wilhelm Weber. 


Indeed, he is regarded as one of the great physicists. Moreover, astronomers also consider 
him as one of their own. He spent nearly 40 years as director of the observatory at 
Göttingen, and when Ceres was discovered and then lost to view, Gauss applied his
prodigious computational skill to compute the orbit from the limited data available. The 
methods he devised are still in use, and Ceres was “rediscovered” precisely where Gauss 
predicted. 


The motto on Gauss's seal was pauca sed matura—few but ripe. He lived by this dictum 
in the sense that he refused to publish any work until he had perfected it. “A cathedral 
is not a cathedral” he said, “until the last scaffolding is down and out of sight.” This 
led him to withhold publication of several major discoveries because he had not had 
time to polish them. He wrote them instead in his diary, which ultimately contained 46 
cryptic statements of results in 19 pages. The diary was misplaced after his death but 
reappeared in 1898 and was published (by Felix Klein) in 1901, 46 years after Gauss 
died. Although not all his results were recorded in the diary (many were set down only in 
letters to friends), several entries would have each given fame to their author if published. 
Gauss knew about the quaternions before Hamilton; he invented noneuclidean geometry 
before Bolyai and Lobachevski; he studied elliptic functions before Abel and Legendre; 
and, before Cauchy, he had defined analytic functions of a complex variable and proved 
what is now called the Cauchy integral theorem. 


Gauss disliked teaching and preferred his job at the observatory to a professorship. He 
usually rejected aspiring young mathematicians who approached him; but the students 




that he did accept included Eisenstein, Riemann, Kummer, Dirichlet, and Dedekind. 
His mathematical interests knew no bounds, and many of his achievements have not 
been mentioned here (his fundamental work in differential geometry, for example, or his 
apparent possession of the prime number theorem). It is no wonder he is called “the 
prince of mathematicians.” It is nearly 160 years since his death, but, as E. T. Bell has 
said, “he lives everywhere in mathematics.” 


Exercises 4.2 


1. (a) If $a \neq 0$ in a field $F$, show that $a$ divides $f$ for every $f$ in $F[x]$.
(b) If $p$ divides $f$ for every $f$ in $F[x]$, show that $p = a \neq 0$, $a \in F$.

2. If $f$ and $g$ are in $F[x]$, $F$ a field, consider the statements:
(1) $f = ag$ for $0 \neq a \in F$; (2) $f$ and $g$ have the same roots in $F$.
(a) Show that (1) $\Rightarrow$ (2). (b) Does (2) $\Rightarrow$ (1)? Support your answer.


3. In each case explain why $f$ is not irreducible over any field.


(a) f = 2° ~ 2x7 + 3a —-2 (b) f=a22+n?+4 

4. In each case determine whether the polynomial is irreducible. Give reasons.
(a) oF +5 in Zy[z} (b) x? — 2 in R[z] 

(c) 2? +11 in C[z] (d) 2° — 4 in Zy, [2] 

(e) 28 +2+1 in Zs[z] (f) a? +a2+1 in Zy7[z] 


5. In each case determine whether the polynomial is irreducible over each of the fields
$\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, $\mathbb{Z}_2$, $\mathbb{Z}_3$, $\mathbb{Z}_5$, and $\mathbb{Z}_7$.
(a) 22-8 (b) 2? +241 (c) +241 (d) x? —2 


6. Let $R$ be an integral domain and let $f \in R[x]$ be monic. If $f$ factors properly in
$R[x]$, show that it has a proper factorization $f = gh$, where $g$ and $h$ are both monic.
Find a monic quartic in R[x] with (1 — 72) and 4 as roots. Is there a cubic? 

8. (a) If $x^2 + ax + b$ has roots $u$ and $v$ in a field $F$, show that $b = uv$ and $a = -(u + v)$.
(b) Show that $1 + i$ is a root of $x^2 + (1 - 2i)x - (3 + i) \in \mathbb{C}[x]$. Find the other root.

9. Show that an odd degree polynomial in $\mathbb{R}[x]$ has a real root. (Requires calculus.)

10. Find all monic irreducible cubics in $\mathbb{Z}_2[x]$.

11. If $f$ in $\mathbb{Z}_2[x]$ is irreducible and $\deg f > 1$, show that $f$ has a nonzero constant term
and has an odd number of terms. Is the converse true? Explain.

12. Let $p$ be a monic quartic in $\mathbb{Z}_2[x]$. Show that $p$ is irreducible in $\mathbb{Z}_2[x]$ if and only if
(1) $p$ has no root in $\mathbb{Z}_2$ and (2) $p \neq x^4 + x^2 + 1$. [Hint: Example 5.]

13. Show that a monic quintic $p$ in $\mathbb{Z}_2[x]$ is irreducible if and only if (1) $p$ has no root
in $\mathbb{Z}_2$ and (2) $p$ is neither $x^5 + x^4 + 1$ nor $x^5 + x + 1$. [Hint: Exercise 10.]

14. Find all monic irreducible quadratics in $\mathbb{Z}_3[x]$.

15. Find a list of six quartics in $\mathbb{Z}_3[x]$ such that a monic quartic $p$ is irreducible if and
only if it has no root in $\mathbb{Z}_3$ and is not in the list. [Hint: Exercise 14.]

16. Show that there are $\tfrac{1}{2}p(p - 1)$ monic irreducible quadratics in $\mathbb{Z}_p[x]$, where $p \in \mathbb{Z}$ is
a prime. [Hint: There are $p^2$ monic quadratics; subtract the number that factor.]

17. If $p \in \mathbb{Z}$ is a prime, prove the converse of Example 7: If $p \not\equiv 3 \pmod 4$, then $x^2 + 1$
is not irreducible over $\mathbb{Z}_p$. [Hint: Corollary to Theorem 8 §1.3.]


18. In each case factor $f$ as a product of irreducible polynomials in $F[x]$.


(a) f =324+2, F=Zs 
(b) f =3a4 +2, F=Qyy 
(c) f= 2° 4 22° 4+2c+1, F =Z, 
(d) f=294 22742041, F =Z; 


(e) f=at—-2?4+a-1, F=2i3
(f) f=at—o? +e2—-1, F=Zy7

19. Factor $x^5 + x^4 + 1$ as a product of irreducible polynomials in $\mathbb{Z}_2[x]$.

20. Factor z° + 2? — z+ 1 as a product of irreducible polynomials in Zs[z].

21. Show that each polynomial is irreducible in $\mathbb{Q}[x]$.
(a) 323 + 5a? +442
(b) 543 + 27 + 3
(c) 2? + 92? +246
(d) 2? +07+10248

22. Show that each polynomial is irreducible in $\mathbb{Q}[x]$.
(a) 2° + 6a* + 12” + 15 (b) $4x^5 + 28x^4 + 7x^3 - 28x^2 + 14$

23. In each case use the method of Example 13 to show that $f$ is irreducible over $\mathbb{Q}$.
(a) f=at+20-1
(b) f=at+4r+1
(c) f =a*+m, where m = 4k — 3, k an integer
(d) f =x++4mz +1, where m is an integer

24. Show that $f = x^4 + 4x^3 + 4x^2 + 4x + 5$ is irreducible over $\mathbb{Q}$ by considering $f(x - 1)$.

25. If $p \in \mathbb{Z}$ is an odd prime, show that $f = 1 - x + x^2 - \cdots + x^{p-1}$ is irreducible over
$\mathbb{Q}$. [Hint: Example 13 with $x \mapsto x - 1$.]

26. Write $f_n = x^{n-1} + x^{n-2} + \cdots + x + 1$.
(a) Factor fa and fg into irreducible polynomials in $\mathbb{Q}[x]$.
(b) Show that $f_n$ is not irreducible if $n > 2$ is not a prime (see Example 13).

27. If $p \in \mathbb{Z}$ is a prime and $m$ is an integer, show that $x^p + p^2 m x + (p - 1)$ is irreducible
over $\mathbb{Q}$. [Hint: Example 13.]

28. Find a polynomial in $\mathbb{Z}[x]$ irreducible over $\mathbb{Q}$ but not over $\mathbb{Z}_2$, $\mathbb{Z}_3$, $\mathbb{Z}_5$, and $\mathbb{Z}_7$.

29. Show that $x^n - p$ is irreducible in $\mathbb{Q}[x]$ for all $n \geq 2$ and all primes $p \in \mathbb{Z}$. (Hence
$\mathbb{Q}[x]$ has infinitely many irreducible polynomials of every degree $\geq 2$.)

30. Show that $x^p - a$ is not irreducible in $\mathbb{Z}_p[x]$ for every $a \in \mathbb{Z}_p$.

31. Let $F \subseteq K$ be fields and let $f$ and $g$ be polynomials in $F[x]$.
(a) If $f$ is irreducible in $K[x]$, show that it is irreducible in $F[x]$.
(b) If $f$ and $g$ are relatively prime in $F[x]$, show that they are relatively prime in
$K[x]$. [Hint: Theorem 10(4).]

32. Is x* + 1 irreducible over $\mathbb{R}$? Defend your answer. [Hint: Theorem 4.]

33. If $p \in \mathbb{Z}$ is a prime, show that $x^2 + x + 1$ is irreducible over $\mathbb{Z}_p$ if $p = 2, 5, 11, 17$, and
not irreducible if $p = 3, 7, 13, 19$.

34. Let f = 2° — 422? + 352 +m. Show that there are infinitely many integers $m$ for
which $f$ is irreducible in $\mathbb{Q}[x]$. [Hint: Eisenstein.]

35. In each case factor $f$ into irreducible polynomials in $\mathbb{Q}[x]$.
(a) f=a2t+ 32342743241
(b) f = a4 +23 — 7a? + 32-2
(c) f=a* +203 — 227 +72 —2
(d) f=at—2° +22? —32+42

36. If m and p are integers with p prime, show that «++ mz-+ p: is irreducible in $\mathbb{Q}[x]$
if and only if it has no root in $\mathbb{Q}$.

37. (a) Factor 2° + a +1 as a product of irreducible polynomials in $\mathbb{Q}[x]$.
(b) Factor 2° + 32+ 1 as a product of irreducible polynomials in $\mathbb{Q}[x]$.

38. Let $F \subseteq K$ be fields and let $f$ and $g \neq 0$ be in $F[x]$. If $f = qg$ for some $q \in K[x]$,
show that actually $q \in F[x]$. [Hint: Division algorithm.]




39. In each case compute $d = \gcd(f, g)$, and express it in $F[x]$ as a linear combination
of $f$ and $g$.
(a) f=a?+2,g=2°4+4e?+2r4+1F=Z5 
(b) fee? tlig=ei+ett+e?t+e2t+re4+1F=Z, 
(c) f=2*-x2-2,g=25 —423 — 22? + 72 -6;F =Q 
(d) f=e8+2-2,g=25—2t+22?-2z-1F=Q 

40. Prove the Corollary to Theorem 9.

41. Let $f$ and $g$ be monic in $F[x]$, $F$ a field. Show that $f$ divides $g$ if and only if
$\gcd(f, g) = f$. [Hint: Theorem 10.]

42. If $F$ is a field, let $\gcd(f, g) = 1$ in $F[x]$. Mimic the proof in $\mathbb{Z}$ to:
(a) Show that, if $f$ and $g$ both divide $h$, then $fg$ divides $h$.
(b) Show that, if $f$ divides $gh$, then $f$ divides $h$.

43. Let $F$ be a field. A ring homomorphism $\sigma : F[x] \to F[x]$ is said to fix $F$ if $\sigma(a) = a$
for all $a \in F$.
(a) If $b \in F$ and $\sigma(f) = f(x + b)$ then $\sigma$ is a ring automorphism fixing $F$.
(b) If $0 \neq a \in F$ and $\sigma(f) = f(ax)$ then $\sigma$ is a ring automorphism fixing $F$.
(c) If $\sigma : F[x] \to F[x]$ is any ring automorphism that fixes $F$, show that $b$ and $a \neq 0$
exist in $F$ such that $\sigma(f) = f(ax + b)$ for all $f$ in $F[x]$.

44. Let $F$ be a field, and let $a \mapsto \bar{a}$ be a ring automorphism $F \to F$. Given a polynomial
$f = a_0 + a_1 x + \cdots + a_n x^n$ in $F[x]$, define $\bar{f}$ in $F[x]$ by $\bar{f} = \bar{a}_0 + \bar{a}_1 x + \cdots + \bar{a}_n x^n$.
(a) Show that $f \mapsto \bar{f}$ is a ring automorphism $F[x] \to F[x]$.
(b) If $\sigma : F[x] \to F[x]$ is any ring automorphism, show that there exist $a \neq 0$ and
$b$ in $F$, and an automorphism $a \mapsto \bar{a}$ of $F$, such that $\sigma(f) = \bar{f}(ax + b)$. [See Exercise
43.]


4.3 FACTOR RINGS OF POLYNOMIALS OVER A FIELD 


If $F$ is a field the similarity in Section 4.2 between the factorization theory in $F[x]$
and that in $\mathbb{Z}$ continues. Every ideal of $\mathbb{Z}$ has the form $n\mathbb{Z} = (n)$, the factor ring
$\mathbb{Z}/n\mathbb{Z}$ is easy to describe, and the generator $n$ is uniquely determined if we insist
that $n \geq 0$. This remains true in $F[x]$: Every ideal $A$ of $F[x]$ is principal, that
is $A = (h)$ consists of all multiples of a polynomial $h \in F[x]$. Furthermore, $h$ is
uniquely determined by $A$ if we ask that it is monic. Finally, we give a simple,
explicit description of the factor ring $F[x]/(h)$.
We begin with the fact that ideals in $F[x]$ are principal.


Theorem 1. If $F$ is a field every ideal $A$ of $F[x]$ is principal. In fact, if $A \neq 0$, a
uniquely determined monic polynomial $h$ exists in $F[x]$ such that $A = (h)$.


Proof. If $A = 0$ then $A = (0)$. If $A \neq 0$, it contains nonzero polynomials and hence
contains monic polynomials (being an ideal). Among all the monic polynomials in
$A$, choose $h$ of minimal degree. Clearly, $(h) \subseteq A$; we show that this is equality. If
$f$ is in $A$, the division algorithm (Theorem 4 §4.1) gives $q$ and $r$ in $F[x]$ such that
$f = qh + r$ and either $r = 0$ or $\deg r < \deg h$. We show that $r = 0$ (so $f \in (h)$).
Suppose $r \neq 0$, and let $a$ be its leading coefficient. Then $a^{-1}r$ is monic and

$$a^{-1}r = a^{-1}[f - qh] = a^{-1}f - a^{-1}qh \in A.$$

But $\deg(a^{-1}r) = \deg r < \deg h$, contradicting the choice of $h$. So $r = 0$ and $A = (h)$.




To prove uniqueness, suppose that $A = (k)$, where $k$ is also monic. Then
$(k) = (h)$, so each of $k$ and $h$ divides the other. As both are monic, $k = h$ by
Theorem 9 §4.2. ∎


Hence both $F[x]$ and $\mathbb{Z}$ are examples of principal ideal domains, that is integral
domains in which every ideal is principal. We say more about these in Chapter 5.
If $F$ is a field, Theorem 1 shows that the correspondence

$$h \mapsto (h)$$

is a bijection between the monic polynomials $h$ in $F[x]$ and the nonzero ideals of
$F[x]$. Note that $F[x] = (1)$. Hence, our task in this section is to describe the factor
rings $F[x]/(h)$ in as much detail as possible, where $h$ is any monic polynomial.


Example 1 is an important special case, and the method of analysis serves as a 
prototype for the general case that follows. 


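Example 1. Describe the factor ring $R = \mathbb{R}[x]/A$, where $A = (x^2 + 1)$.

Solution. The elements of $R$ are cosets $f + A$, $f \in \mathbb{R}[x]$, which we write as $\bar{f} = f + A$
for convenience. Hence, the operations in $R$ are

$$\bar{f} + \bar{g} = \overline{f + g} \quad\text{and}\quad \bar{f} \cdot \bar{g} = \overline{fg}.$$

Given $f \in \mathbb{R}[x]$, the division algorithm gives $q$ in $\mathbb{R}[x]$ such that

$$f = q(x^2 + 1) + (a + bx), \quad a, b \in \mathbb{R}.$$

Hence $f - (a + bx) \in A$, so $\bar{f} = \overline{a + bx} = \bar{a} + \bar{b}\bar{x}$. Thus $R$ can be described as

$$R = \{\bar{a} + \bar{b}\bar{x} \mid a, b \in \mathbb{R}\}.$$

The ring axioms define the operations of $R$ when the elements are presented in this
way. The addition is easy:

$$(\bar{a} + \bar{b}\bar{x}) + (\bar{c} + \bar{d}\bar{x}) = (\overline{a + c}) + (\overline{b + d})\bar{x}.$$

However, at first glance, the multiplication does not appear to be closed:

$$(\bar{a} + \bar{b}\bar{x})(\bar{c} + \bar{d}\bar{x}) = \overline{ac} + (\overline{ad + bc})\bar{x} + \overline{bd}\,\bar{x}^2.$$

The problem is $\bar{x}^2$. However, we have $x^2 + 1 \in A$, so $\bar{x}^2 + \bar{1} = \overline{x^2 + 1} = \bar{0}$. Thus,
$\bar{x}^2 = -\bar{1}$ completes the description of the multiplication in $R$:

$$(\bar{a} + \bar{b}\bar{x})(\bar{c} + \bar{d}\bar{x}) = (\overline{ac - bd}) + (\overline{ad + bc})\bar{x}.$$

Does this look familiar? If we denote $\bar{x}$ by a simpler symbol, say $\bar{x} = i$, then $R$
looks like $\mathbb{C}$ except for writing $\bar{a}$ in place of $a$ for all $a \in \mathbb{R}$. But even this difference
is no problem: The map $a \mapsto \bar{a}$ is a one-to-one ring homomorphism $\mathbb{R} \to R$ (verify),
so we may identify $\mathbb{R} \subseteq R$ as a subring by taking $\bar{a} = a$ for all $a \in \mathbb{R}$. Finally then,
our description of $R$ takes the form

$$R = \{a + bi \mid a, b \in \mathbb{R},\ i^2 = -1\}.$$

This is the ring $\mathbb{C}$ of complex numbers, created before your eyes! (The only thing
remaining to check is that $a + bi = c + di$ implies that $a = c$ and $b = d$, and this is
left to the reader.) □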
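
As a quick sanity check on Example 1, the multiplication rule obtained from $\bar{x}^2 = -\bar{1}$ can be compared with the built-in complex arithmetic of a programming language. The sketch below is illustrative only; the pair `(a, b)` stands for the coset of $a + bx$, and the function name is our own.

```python
def mul_mod_x2_plus_1(u, v):
    """Multiply the cosets of a + b*x and c + d*x in R[x]/(x^2 + 1)."""
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and x^2 is replaced by -1
    return (a * c - b * d, a * d + b * c)

product_in_R = mul_mod_x2_plus_1((2, 3), (-1, 4))
product_in_C = complex(2, 3) * complex(-1, 4)
# both describe the same element: real part -14, "x" (or i) part 5
```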


The analysis in Example 1 extends to the general case. For clarity we break the
argument into a series of lemmas. The notation we use is as follows: Let $F$ be a
field and let $h \in F[x]$ be a monic polynomial of degree $m \geq 1$. We write

$$A = (h) = \{qh \mid q \in F[x]\}$$

for the principal ideal generated by $h$, and denote the (commutative) factor ring by

$$R = F[x]/A.$$

Then $R$ consists of cosets $f + A$, $f \in F[x]$, and we write them as in Example 1:

$$\bar{f} = f + A.$$

In particular, $\bar{a} = a + A$ for each $a \in F$, and we adopt the symbol $t$ for $x + A$:

$$t = \bar{x} = x + A.$$

The operations in $R$ are

$$\bar{f} + \bar{g} = \overline{f + g} \quad\text{and}\quad \bar{f}\,\bar{g} = \overline{fg}.$$
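Lemma 1. $R = \{\bar{a}_0 + \bar{a}_1 t + \cdots + \bar{a}_{m-1} t^{m-1} \mid a_i \in F\}$.


Proof. A typical element of $R$ has the form $\bar{f}$, $f \in F[x]$. By the division algorithm,
$q$ exists in $F[x]$ such that

$$f = qh + (a_0 + a_1 x + \cdots + a_{m-1} x^{m-1}), \quad a_i \in F.$$

Because $h \in A$, we have $\bar{h} = \bar{0}$ in $R$, so

$$\bar{f} = \overline{a_0 + a_1 x + \cdots + a_{m-1} x^{m-1}} = \bar{a}_0 + \bar{a}_1 t + \cdots + \bar{a}_{m-1} t^{m-1}.$$ ∎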
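
The proof of Lemma 1 is effectively an algorithm: the representative of a coset $\bar{f}$ is the remainder of $f$ on division by $h$. A minimal sketch follows, assuming integer coefficients stored as lists with the constant term first; because $h$ is monic, no division of coefficients is needed. The function name is hypothetical.

```python
def coset_representative(f, h):
    """Remainder of f on division by the monic polynomial h.

    Both are coefficient lists, lowest degree first; the result has
    length m = deg h and lists a_0, ..., a_{m-1}.
    """
    r = list(f)
    m = len(h) - 1
    while len(r) > m:
        c = r.pop()                 # leading coefficient of the current r
        shift = len(r) - m          # subtract c * x^shift * h
        for i in range(m):
            r[shift + i] -= c * h[i]
    return r + [0] * (m - len(r))

# f = x^3 + 2x + 5 and h = x^2 + 1: f = x*h + (x + 5)
print(coset_representative([5, 2, 0, 1], [1, 0, 1]))   # [5, 1]
```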
Lemma 2. The map $\theta : F \to R$ given by $\theta(a) = \bar{a}$ is a one-to-one ring homomorphism.


Proof. The map $\theta$ is a homomorphism because $\bar{a} + \bar{b} = \overline{a + b}$, $\bar{a}\bar{b} = \overline{ab}$, and $\bar{1} = 1 + A$ is
the unity of $R$. To see that $\theta$ is one-to-one, let $\theta(a) = \bar{0}$. Then $\bar{a} = \bar{0}$, so $a + A = 0 + A$,
that is $a \in A$. If $a \neq 0$, then $A = F[x]$ because $a$ is a unit in $F[x]$, so $1 \in A = (h)$.
This implies that $1 = hf$ for some $f$ in $F[x]$ and hence that $\deg h = 0$, contrary to
assumption. Thus, $a = 0$ and $\theta$ is one-to-one. ∎


It follows from Lemma 2 that $\{\bar{a} \mid a \in F\} = \theta(F)$ is a subring of $R$ that is isomorphic
to $F$. Hence, we may identify $F = \theta(F) \subseteq R$ by taking $\bar{a} = a$ for all $a \in F$. This
being done, Lemma 1 takes the form

$$R = \{a_0 + a_1 t + \cdots + a_{m-1} t^{m-1} \mid a_i \in F\}.$$

Lemma 3 shows that the elements of $R$ are uniquely represented in this way.
Lemma 3. If $a_0 + a_1 t + \cdots + a_{m-1} t^{m-1} = b_0 + b_1 t + \cdots + b_{m-1} t^{m-1}$ in $R$, then
$a_i = b_i$ for each $i$. [59]

Proof. The condition gives

$$(a_0 - b_0) + (a_1 - b_1)t + \cdots + (a_{m-1} - b_{m-1})t^{m-1} = 0.$$

Hence it suffices to show that $c_0 + c_1 t + \cdots + c_{m-1} t^{m-1} = 0$, $c_i \in F$, implies that
$c_i = 0$ for each $i$. To this end, write $k = c_0 + c_1 x + \cdots + c_{m-1} x^{m-1}$. Then $k \in F[x]$
so it suffices to show that $k = 0$. First we have

$$\bar{k} = \bar{c}_0 + \bar{c}_1 \bar{x} + \cdots + \bar{c}_{m-1} \bar{x}^{m-1} = c_0 + c_1 t + \cdots + c_{m-1} t^{m-1} = 0$$

in $R$. This means that $k \in A$, so $k = qh$ for some $q \in F[x]$. If $k \neq 0$, this gives

$$m - 1 \geq \deg k = \deg q + \deg h \geq \deg h = m,$$

a contradiction. Thus, $k = 0$ in $F[x]$, so $c_i = 0$ for all $i$, as required. ∎

[59] Students of linear algebra will recognize this as showing that the set $\{1, t, t^2, \ldots, t^{m-1}\}$ is
linearly independent, and hence that it is a basis of $R$ as an $m$-dimensional vector space over $F$.
As in Example 1, the addition in $R$ is straightforward in our new notation:

$$(a_0 + a_1 t + \cdots + a_{m-1} t^{m-1}) + (b_0 + b_1 t + \cdots + b_{m-1} t^{m-1})
= (a_0 + b_0) + (a_1 + b_1)t + \cdots + (a_{m-1} + b_{m-1})t^{m-1}.$$

However, the multiplication involves powers of $t$ higher than $m - 1$. In the case of
the complex numbers in Example 1, we wrote $t = i$, and the fact that $i^2 = -1$
enabled us to express the product in the form $a + bi$. In that situation, we had
$h = x^2 + 1$, so the condition $i^2 = -1$ is $h(i) = 0$. This holds in general.


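Lemma 4. In the ring $R$, we have $h(t) = 0$.

Proof. Write $h = c_0 + c_1 x + \cdots + c_{m-1} x^{m-1} + x^m$. Recalling that $t = x + A$ and
that we are writing $a = \bar{a} = a + A$ for all $a \in F$, we compute in $R$:

$$\begin{aligned}
h(t) &= c_0 + c_1 t + \cdots + c_{m-1} t^{m-1} + t^m \\
     &= \bar{c}_0 + \bar{c}_1 \bar{x} + \cdots + \bar{c}_{m-1} \bar{x}^{m-1} + \bar{x}^m \\
     &= \bar{h} = \bar{0} = 0.
\end{aligned}$$ ∎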


The following theorem summarizes all the information we have gathered. 


Theorem 2. Let $F$ be a field and let $h$ be a monic polynomial in $F[x]$ of degree
$m \geq 1$. Then the factor ring $F[x]/(h)$ is given by

$$\frac{F[x]}{(h)} = \{a_0 + a_1 t + \cdots + a_{m-1} t^{m-1} \mid a_i \in F;\ h(t) = 0\}.$$

Moreover, this representation of the elements of $F[x]/(h)$ is unique:

$$a_0 + a_1 t + \cdots + a_{m-1} t^{m-1} = b_0 + b_1 t + \cdots + b_{m-1} t^{m-1}$$

holds if and only if $a_i = b_i$ for each $i$. [60]


The formulation in Theorem 2 completely describes the ring $F[x]/(h)$. Each
element is uniquely represented in the form:

$$a_0 + a_1 t + \cdots + a_{m-1} t^{m-1}, \quad a_i \in F \qquad (*)$$

where $m = \deg h$. Addition and multiplication of such expressions are given by the
ring axioms, the operations in $F$, and the requirement that $h$ is monic and $h(t) = 0$.
These conditions on $h$ allow us to express $t^m$ in terms of lower powers of $t$, and
hence to reduce all products in $R$ to the form $(*)$, as guaranteed by Lemma 1. Thus,
the multiplication depends in a crucial way on the polynomial $h$.

Example 1 describes the situation when $F = \mathbb{R}$ and $h = x^2 + 1$, and the ring
$\mathbb{R}[x]/(x^2 + 1)$ turns out to be the ring of complex numbers $\mathbb{C}$. (In general it is the
ring $F(t)$ mentioned in Section 3.2.) Examples 2-7 provide more illustrations.
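
The reduction of products to the form $(*)$ can be carried out mechanically when $F = \mathbb{Z}_p$. The following sketch is an illustration under stated assumptions: elements are coefficient lists $[a_0, \ldots, a_{m-1}]$, $h$ is given as $[c_0, \ldots, c_{m-1}, 1]$, and the function name is our own. It multiplies two elements and then uses $h(t) = 0$ to eliminate the powers $t^k$ with $k \geq m$.

```python
def mul_in_factor_ring(a, b, h, p):
    """Multiply a(t) and b(t) in Z_p[x]/(h), where h is monic of degree m >= 1.

    a and b are coefficient lists of length m (lowest degree first);
    h lists c_0, ..., c_{m-1}, 1.
    """
    m = len(h) - 1
    prod = [0] * (2 * m - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # h(t) = 0 means t^m = -(c_0 + c_1 t + ... + c_{m-1} t^{m-1}).
    # Eliminate t^k for k >= m, starting with the highest power.
    for k in range(2 * m - 2, m - 1, -1):
        c = prod[k]
        prod[k] = 0
        for i in range(m):
            prod[k - m + i] = (prod[k - m + i] - c * h[i]) % p
    return prod[:m]

# In Z_2[x]/(x^3 + 1): (1 + t)(1 + t + t^2) reduces to 0
print(mul_in_factor_ring([1, 1, 0], [1, 1, 1], [1, 0, 0, 1], 2))   # [0, 0, 0]
```

Changing only the list `h` changes the ring: with `h = [0, 0, 1]` (that is, $x^2$) the same function gives $(1 + t)^2 = 1$ over $\mathbb{Z}_2$, while with `h = [1, 1, 1]` it gives $t^2 = 1 + t$.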


[60] The reader may have noticed that the discussion leading to Theorem 2 makes little use of the
fact that $F$ is a field. In fact, Theorem 2 is valid for any commutative ring in place of $F$ (Exercise
30).




Example 2. If $F$ is a field, describe the factor ring $R = F[x]/(x^2)$.

Solution. Theorem 2 applies with $h = x^2$ and $m = 2$. Hence

$$F[x]/(x^2) = \{a + bt \mid a, b \in F;\ t^2 = 0\}.$$

Thus, the addition in $R$ is $(a + bt) + (c + dt) = (a + c) + (b + d)t$, as before. However,
because $t^2 = 0$, the multiplication is $(a + bt)(c + dt) = ac + (ad + bc)t$.

For a specific instance, take $F = \mathbb{Z}_2 = \{0, 1\}$. Then $\mathbb{Z}_2[x]/(x^2) = \{0, 1, t, 1 + t\}$
is a ring with four elements. Because $1 + 1 = 0$ in $\mathbb{Z}_2$ and $t^2 = 0$, the addition and
multiplication tables are as follows:

     +  |  0    1    t    1+t           ×  |  0    1    t    1+t
    ----+----------------------        ----+----------------------
     0  |  0    1    t    1+t           0  |  0    0    0    0
     1  |  1    0    1+t  t             1  |  0    1    t    1+t
     t  |  t    1+t  0    1             t  |  0    t    0    t
    1+t |  1+t  t    1    0            1+t |  0    1+t  t    1       □

Example 3. Describe the ring $\mathbb{Z}_2[x]/(x^2 - 1)$.


Solution. This is Theorem 2 with $F = \mathbb{Z}_2$, $h = x^2 - 1$ and $m = 2$. Here $t^2 = 1$ so we
use the notation $t = g$ because $G = \{1, g\}$ is then the cyclic group of order 2. Hence
$\mathbb{Z}_2[x]/(x^2 - 1) = \{a + bg \mid a, b \in \mathbb{Z}_2;\ g^2 = 1\}$ and so consists of linear combinations
of the elements of $G$ with coefficients from $\mathbb{Z}_2$. As such, it is called a group ring,
and is usually denoted $\mathbb{Z}_2 G$:

$$\mathbb{Z}_2 G = \{a + bg \mid a, b \in \mathbb{Z}_2;\ g^2 = 1\}.$$

The addition is the same as in Example 2 (with $t = g$), but the multiplication is
$(a + bg)(c + dg) = (ac + bd) + (ad + bc)g$.

The addition and multiplication tables for $\mathbb{Z}_2 G$ are as follows:

     +  |  0    1    g    1+g           ×  |  0    1    g    1+g
    ----+----------------------        ----+----------------------
     0  |  0    1    g    1+g           0  |  0    0    0    0
     1  |  1    0    1+g  g             1  |  0    1    g    1+g
     g  |  g    1+g  0    1             g  |  0    g    1    1+g
    1+g |  1+g  g    1    0            1+g |  0    1+g  1+g  0

Note that the additive group of $\mathbb{Z}_2 G$ is isomorphic to the additive group of the ring
$R = \mathbb{Z}_2[x]/(x^2)$ in Example 2, but the multiplication tables are different. Nonetheless,
the map $\theta : \mathbb{Z}_2 G \to R = \mathbb{Z}_2[x]/(x^2)$ given by $\theta(a + bg) = (a + b) + bt = a + b(1 + t)$
is a ring isomorphism as the reader can verify. □
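
Since both rings in Example 3 have only four elements, the verification of the isomorphism $\theta$ can be done by exhaustive checking. The sketch below is one way to do this; pairs `(a, b)` stand for $a + bg$ or $a + bt$, and all function names are chosen for this illustration.

```python
from itertools import product

def mul_group_ring(x, y):
    """(a + bg)(c + dg) in Z_2 G, using g^2 = 1."""
    (a, b), (c, d) = x, y
    return ((a * c + b * d) % 2, (a * d + b * c) % 2)

def mul_dual(x, y):
    """(a + bt)(c + dt) in Z_2[x]/(x^2), using t^2 = 0."""
    (a, b), (c, d) = x, y
    return ((a * c) % 2, (a * d + b * c) % 2)

def add(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

def theta(x):
    """theta(a + bg) = (a + b) + bt."""
    a, b = x
    return ((a + b) % 2, b)

elements = list(product(range(2), repeat=2))
is_bijective = sorted(theta(x) for x in elements) == elements
preserves_ops = all(
    theta(add(x, y)) == add(theta(x), theta(y))
    and theta(mul_group_ring(x, y)) == mul_dual(theta(x), theta(y))
    for x in elements for y in elements
)
print(is_bijective, preserves_ops, theta((1, 0)) == (1, 0))
```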


Group rings $FG$ can be constructed for any field $F$ and group $G$, and are
important in both the theory of rings and groups. In particular, if $g^n = 1$ and
$G = \{1, g, g^2, \ldots, g^{n-1}\}$ is the cyclic group of order $n$, then

$$FG = \{a_0 + a_1 g + \cdots + a_{n-1} g^{n-1} \mid a_i \in F \text{ and } g^n = 1\}$$

comes from Theorem 2 with $h = x^n - 1$.


Example 4. Consider the ring $R = \mathbb{Z}_2[x]/(x^3 + 1)$. Here $h = x^3 + 1$, so $m = 3$
and $t^3 + 1 = 0$ in Theorem 2. Hence

$$R = \{a + bt + ct^2 \mid a, b, c \in \mathbb{Z}_2;\ t^3 + 1 = 0\}.$$

Now $|R| = 8$ because (by Lemma 3) there are two independent choices for each of
$a$, $b$, and $c$ in forming $a + bt + ct^2$. Thus

$$R = \{0,\ 1,\ t,\ t^2,\ 1 + t,\ 1 + t^2,\ t + t^2,\ 1 + t + t^2\}.$$

Because $\operatorname{char} \mathbb{Z}_2 = 2$, we have $1 + 1 = 0$ and $t^3 = 1$ in $R$. A typical calculation is

$$(1 + t)(1 + t + t^2) = 1 + 2t + 2t^2 + t^3 = 1 + 0 + 0 + 1 = 0.$$

The reader should verify that both $1 + t + t^2$ and $t + t^2$ are idempotents in $R$. □

Example 5. Describe the ring $R = \mathbb{Q}[x]/(x^2 - 2)$.


Solution. Here $h = x^2 - 2$ and $m = 2$, so $R = \{a + bt \mid a, b \in \mathbb{Q};\ t^2 = 2\}$ by Theorem
2. Clearly, this is (isomorphic to) the subring $\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$ of $\mathbb{R}$
described in Example 4 §3.2. □


In Example 4 §3.2, we showed directly that the ring $\mathbb{Q}(\sqrt{2})$ is a field. In the
present context this property follows immediately from the next theorem and the
fact that $x^2 - 2$ is irreducible over $\mathbb{Q}$.


Theorem 3. Let $h$ be a monic polynomial of degree $m \geq 1$ in $F[x]$, where $F$ is a
field. The following conditions are equivalent:

(1) $F[x]/(h)$ is a field.

(2) $F[x]/(h)$ is an integral domain.

(3) $h$ is irreducible over $F$.


Proof. (1) $\Rightarrow$ (2). This is clear; every field is an integral domain.

(2) $\Rightarrow$ (3). For convenience write $A = (h)$. If $h = fg$ is a factorization in $F[x]$,
compute $(f + A)(g + A) = fg + A = h + A = 0 + A = 0$ in $F[x]/A$. By (2), either
$f + A = 0$ or $g + A = 0$; that is $f \in A$ or $g \in A$. If $f \in A$, then $f = qh$ for some
$q \in F[x]$. Hence $h = fg = qhg$, so (as $F[x]$ is an integral domain) $1 = qg$, which
implies $\deg g = 0$. Similarly, $g \in A$ implies $\deg f = 0$. This proves (3).

(3) $\Rightarrow$ (1). Let $f + A \neq 0$, where $f \in F[x]$. Then $f \notin A$, so $h$ does not divide
$f$. Let $d = \gcd(h, f)$. Then $d \mid h$ so, because $h$ is irreducible and both $d$ and $h$ are
monic, either $d = 1$ or $h = d$ (Corollary to Theorem 9 §4.2). But $h = d$ implies
that $h \mid f$, contrary to $f + A \neq 0$. Hence $d = 1$, so (by Theorem 10 §4.2) $u$ and $v$
exist in $F[x]$ such that $1 = uh + vf$. Then $(v + A)(f + A) = 1 + A$ because $h \in A$,
so $f + A$ is a unit in $F[x]/A$. This proves (1). ∎
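
The argument for (3) $\Rightarrow$ (1) is constructive: the inverse of $f + A$ is $v + A$, where $1 = uh + vf$ comes from the euclidean algorithm. The following is a rough sketch of that computation for $F = \mathbb{Z}_p$, under our own conventions (coefficient lists with the constant term first, hypothetical function names, $p$ assumed prime). Only the multiplier $v$ is tracked, since $u$ is not needed. The call `pow(c, -1, p)` requires Python 3.8 or later.

```python
def trim(a):
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def sub_mul(a, q, b, p):
    """Return a - q*b in Z_p[x]."""
    out = list(a) + [0] * max(0, len(q) + len(b) - 1 - len(a))
    for i, qi in enumerate(q):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] - qi * bj) % p
    return trim(out)

def divmod_poly(f, g, p):
    """Division algorithm in Z_p[x]; g is nonzero with reduced coefficients."""
    r = trim(f)
    q = [0] * max(len(r) - len(g) + 1, 1)
    inv_lead = pow(g[-1], -1, p)          # inverse of the leading coefficient
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = (r[-1] * inv_lead) % p
        q[shift] = c
        r = sub_mul(r, [0] * shift + [c], g, p)
    return trim(q), r

def inverse_mod(f, h, p):
    """Return v with v*f = 1 in Z_p[x]/(h), or None if gcd(h, f) is not 1."""
    r0 = trim([c % p for c in h])
    r1 = trim([c % p for c in f])
    v0, v1 = [], [1]                      # invariant: r_i = (something)*h + v_i*f
    while r1:
        q, r = divmod_poly(r0, r1, p)
        r0, r1 = r1, r
        v0, v1 = v1, sub_mul(v0, q, v1, p)
    if len(r0) != 1:                      # gcd has positive degree: not a unit
        return None
    a = pow(r0[0], -1, p)
    return [(a * c) % p for c in v0]

print(inverse_mod([0, 1], [1, 1, 1], 2))       # inverse of t in Z_2[x]/(x^2 + x + 1)
print(inverse_mod([1, 1], [1, 0, 0, 1], 2))    # 1 + t is not a unit in Z_2[x]/(x^3 + 1)
```

When $h$ is not irreducible, the function returns `None` exactly for those $f$ sharing a nonconstant factor with $h$, which matches the failure of (1) in that case.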


Note that Theorem 3 is the analogue for $F[x]$ of Theorem 7 §1.3 for $\mathbb{Z}$.

Example 6. If $p$ is a prime, $p \equiv 3 \pmod 4$, show that a field of $p^2$ elements exists.


Solution. It is a consequence of Fermat's theorem (see Example 7, §4.2) that $x^2 + 1$
is irreducible over $\mathbb{Z}_p$ when $p \equiv 3 \pmod 4$. Hence,

$$F = \mathbb{Z}_p[x]/(x^2 + 1) = \{a + bt \mid a, b \in \mathbb{Z}_p;\ t^2 = -1\}$$

is a field by Theorem 3. Moreover, in forming a typical element $a + bt$ of $F$ we have
$p$ choices for $a$ and then (by Lemma 3) $p$ independent choices for $b$. Hence, there
are $p^2$ choices in all, so $|F| = p^2$. The field $F$ was denoted $\mathbb{Z}_p(i)$ in Section 3.2. □


It is clear from the solution to Example 6 that, if $h$ is monic and irreducible of
degree $m$ in $\mathbb{Z}_p[x]$, then the field $\mathbb{Z}_p[x]/(h)$ has exactly $p^m$ elements. Moreover, it turns out
that a monic irreducible polynomial of degree $m$ exists in $\mathbb{Z}_p[x]$ for every $m \geq 1$.
Hence, we can construct a field of order $p^m$ for all primes $p$ and all integers $m \geq 1$.
In fact, we obtain every finite field in this way. We return to this in Section 6.4.
We note in passing that $x^2 + 1$ fails to be irreducible over $\mathbb{Z}_p$ if the prime $p$ is
not congruent to 3 modulo 4 (Corollary to Theorem 8 §1.3). In particular, $x^2 + 1$
is not irreducible over $\mathbb{Z}_2$, and so will not yield a field (of order $4 = 2^2$) by our
construction. This is not a major problem since $x^2 + x + 1$ is irreducible over $\mathbb{Z}_2$.
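
The dichotomy between primes congruent to 3 modulo 4 and the others is easy to observe numerically. The brief sketch below (our own illustration, with a hypothetical function name) looks for a root of $x^2 + 1$ in $\mathbb{Z}_p$; for a quadratic, having no root is the same as being irreducible.

```python
def has_root_x2_plus_1(p):
    """True if x^2 + 1 has a root in Z_p, that is, if it is not irreducible over Z_p."""
    return any((a * a + 1) % p == 0 for a in range(p))

for p in [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]:
    status = "not irreducible" if has_root_x2_plus_1(p) else "irreducible"
    print(p, "is", p % 4, "mod 4:", status)
```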


Example 7. Construct a field of four elements. 


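Solution. The polynomial $x^2 + x + 1$ has no root in $\mathbb{Z}_2$, and so is irreducible. Hence
the required field is

$$F = \frac{\mathbb{Z}_2[x]}{(x^2 + x + 1)} = \{a + bt \mid a, b \in \mathbb{Z}_2;\ t^2 + t + 1 = 0\}.$$

Thus, $F = \{0, 1, t, 1 + t\}$ and $t^2 = t + 1$ (as $1 + 1 = 0$ in $\mathbb{Z}_2$). The addition and
multiplication tables are as follows.

     +  |  0    1    t    1+t           ×  |  0    1    t    1+t
    ----+----------------------        ----+----------------------
     0  |  0    1    t    1+t           0  |  0    0    0    0
     1  |  1    0    1+t  t             1  |  0    1    t    1+t
     t  |  t    1+t  0    1             t  |  0    t    1+t  1
    1+t |  1+t  t    1    0            1+t |  0    1+t  1    t       □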
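
The tables in Example 7 can be regenerated from the single relation $t^2 = t + 1$. In the sketch below, which is only an illustration, a pair `(a, b)` stands for $a + bt$ and the names are our own; the last lines confirm that every nonzero element has an inverse, as Theorem 3 predicts.

```python
from itertools import product

ELEMENTS = list(product(range(2), repeat=2))    # (a, b) stands for a + b*t

def add(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

def mul(x, y):
    (a, b), (c, d) = x, y
    # (a + bt)(c + dt) = ac + (ad + bc)t + bd*t^2, and t^2 = t + 1
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

def name(x):
    return {(0, 0): "0", (1, 0): "1", (0, 1): "t", (1, 1): "1+t"}[x]

for x in ELEMENTS:                               # rows of the multiplication table
    print(name(x).ljust(4), [name(mul(x, y)) for y in ELEMENTS])

ONE = (1, 0)
inverses = {x: next(y for y in ELEMENTS if mul(x, y) == ONE)
            for x in ELEMENTS if x != (0, 0)}
print({name(x): name(y) for x, y in inverses.items()})
```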


Let $F$ be a field and let $f$ be any polynomial of positive degree in $F[x]$. Then
$f$ has a monic irreducible factor $p \in F[x]$ by the unique factorization theorem
(Theorem 12 §4.2), say $f = pg$. Given $p$, Theorem 3 shows that $E = F[x]/(p)$
is a field that contains $F$ as a subfield (after identifying each $a \in F$ with the
coset $\bar{a} = a + (p)$ in $E$). In addition, $E$ contains an element $t$ such that $p(t) = 0$ in
$E$. Hence $f(t) = p(t)g(t) = 0$ in $E$, so $t$ is a root of $f$ in $E$. Calling a field $E$ an
extension of $F$ when $F \subseteq E$, we can state this assertion compactly:


Theorem 4. Kronecker's Theorem. If $F$ is any field and $f$ is any polynomial
in $F[x]$ of positive degree, there is an extension field of $F$ in which $f$ has a root.


Theorem 4 is fundamental to the algebraic study of fields. Note that it not only 
proves that the extension exists, but also gives a precise form for its elements. We 
treat this topic in detail in Chapter 6. 
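
To see Kronecker's construction at work, one can evaluate $f$ at $t$ inside $E = \mathbb{Z}_p[x]/(p)$ and observe that the result is zero. The sketch below is a small illustration under our own conventions: elements of $E$ are coefficient lists of length $m = \deg h$, the polynomial called $p$ in the text is passed as `h` to avoid a clash with the prime, and the function names are hypothetical.

```python
def times_t(a, h, p):
    """Multiply a(t) by t in Z_p[x]/(h), where h is monic of degree m = len(a)."""
    m = len(h) - 1
    top = a[-1]                         # coefficient that would become t^m
    shifted = [0] + a[:-1]
    # h(t) = 0 gives t^m = -(c_0 + c_1 t + ... + c_{m-1} t^{m-1})
    return [(shifted[i] - top * h[i]) % p for i in range(m)]

def evaluate_at_t(f, h, p):
    """Horner evaluation of f(t) in Z_p[x]/(h); f is listed lowest degree first."""
    m = len(h) - 1
    acc = [0] * m
    for c in reversed(f):
        acc = times_t(acc, h, p)
        acc[0] = (acc[0] + c) % p
    return acc

# Over Z_2, f = x^3 + 1 = (x + 1)(x^2 + x + 1) has the irreducible factor x^2 + x + 1,
# so t is a root of f in the field Z_2[x]/(x^2 + x + 1).
print(evaluate_at_t([1, 0, 0, 1], [1, 1, 1], 2))   # [0, 0]
print(evaluate_at_t([1, 1], [1, 1, 1], 2))         # [1, 1], so t is not a root of x + 1
```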

If $F = \mathbb{Q}$ in Theorem 4, then the fundamental theorem of algebra asserts that $\mathbb{C}$
is an extension of $\mathbb{Q}$ in which any polynomial of positive degree in $\mathbb{Q}[x]$ has a root.
Hence, strictly speaking, we do not need Kronecker’s Theorem in this case. But no 
purely algebraic proof of the fundamental theorem is known; that is, every proof 
involves a limiting process at some stage. 


Exercises 4.3 
Throughout these exercises $F$ denotes a field.

1. In each case find a monic polynomial $h$ in $F[x]$ such that $A = (h)$.
(a) $A = \{f \in F[x] \mid \text{the constant coefficient of } f \text{ is zero}\}$
(b) $A = \{f \in F[x] \mid \text{the sum of the coefficients of } f \text{ is zero}\}$

234 


» 


10. 


11. 
12, 


13. 


14. 


15. 


16. 


4. Polynomials 


(c) A= {f € Ze[z] | f(0) = f(1) = 0} [Hint: Theorem 10 §4.2\] 
(d) A= {f € Zs[z] | f(0) = f(1) = f(2) = 0} [Hint: Theorem 10 §4.2.] 


. In each case describe R = F[z|/(h) as in Theorem 2 and write out the addition and 


multiplication tables for R. 
(a)h=a2?4+1,F=Z, 
(b)h=a2?+2,F=Zy 
(c)h=a2?4+1,F=Zp 
(d)h=a2?-1,F=Zs, 
(e) h=a2?, F=Zz3 
(f)h=2*-2+1,F=Zs, 


. Construct a field of order 8 and write down the multiplication table. 
. Construct a field of order 9 and write down the multiplication table. 
. In each case construct a field of the given order. 


(a) 27 (b) 25 (c) 121 (d) 49 


. In each case determine all idempotents, nilpotents, and units in R = F[z]/(h). 


(a)h=a2*—a@ (b) h = 2? 


7. In each case show that r is a unit in R = F[x]/(h) and exhibit the inverse. Use the
notation of Theorem 2.
(a)r=1+#t?, F=2Zyj,h=2341 
(b) r=1+t-,F=Z,,h=23+4+2?-1 


8. Because x - a is irreducible over the field F, Theorem 3 asserts that F[x]/(x - a)
is a field. Describe this field. How is it related to F?


9. Find a subring of ℝ isomorphic to $\mathbb{Q}[x]/(x^2 - 2)$.


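10. (a) Show that $F[x]/(x^2) \cong \left\{ \begin{bmatrix} a & b \\ 0 & a \end{bmatrix} \,\middle|\, a, b \in F \right\}$, a subring of $M_2(F)$.
(b) Show that $F[x]/(x^3) \cong \left\{ \begin{bmatrix} a & b & c \\ 0 & a & b \\ 0 & 0 & a \end{bmatrix} \,\middle|\, a, b, c \in F \right\}$, a subring of $M_3(F)$.

(c) Generalize.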

11. Find a ring isomorphism $F[x]/(x^2 - x) \cong F \times F$.

12. Let $R = F[x]/(x^2 - 1) = \{a + bt \mid a, b \in F;\ t^2 = 1\}$. Show that a + bt is a unit in
R if and only if $a^2 \neq b^2$. [Hint: If r = a + bt let r* = a - bt, and N(r) = rr*. Show
that (rs)* = r*s*, and hence that N(rs) = N(r)N(s), for all r, s ∈ R.]

13. (a) Let $h = x^2 - vx - u$ in F[x], where u and v are fixed in F. Define

$S = \left\{ \begin{bmatrix} a & b \\ ub & a + vb \end{bmatrix} \,\middle|\, a, b \in F \right\}.$

Show that S is a subring of $M_2(F)$ and that $F[x]/(h) \cong S$.
(b) Rework Exercises 10(a) and 11 in the light of (a).
(c) If $h = x^2 + 1$ and F = ℝ, obtain a subring of $M_2(F)$ isomorphic to ℂ.
14. Let E = F[x]/(p), where p is irreducible over F. In each case factor p into linear
factors in E[x].
(a) p=22+a+1,F=Zo (b) p= 2? +a7+1, F=Zo 
(c)p=a2°-24+1, F=2Zs; (d) p= 2? —a2? +1, F=Zs; 
15. If p is a monic irreducible quadratic in F[x], show that p factors into linear factors
over E = F[x]/(p).
16. (a) Assume that 2 ≠ 0 in F and that m ∈ F is such that $x^3 - m$ is irreducible over
F. Write $E = F[x]/(x^3 - m)$. Show that $x^3 - m$ factors into linear factors in E[x]
if and only if -3 is a square in F. [Hint: A quadratic $x^2 + rx + s$ factors into linear
factors over a field if and only if the discriminant $r^2 - 4s$ is a square in the field.]




(b) Show that $x^3 - 2$ does not factor into linear factors over $E = \mathbb{Q}[x]/(x^3 - 2)$.
17. Let F be a finite field, say $F = \{a_1, a_2, \ldots, a_n\}$. If $m = (x - a_1)(x - a_2) \cdots (x - a_n)$
and $A = \{f \in F[x] \mid f(a_i) = 0 \text{ for all } i = 1, 2, \ldots, n\}$ denotes the set of all
polynomials in F[x] that annihilate F, show that A = (m).
18. Let A denote the set of all polynomials in ℤ[x] with even constant term. Show
that A is an ideal of ℤ[x] that is not principal. (Hence, Theorem 1 fails for integral
domains in general.)
19. If R is an integral domain for which every ideal of R[x] is principal, show that R
must be a field. [Hint: Exercise 18.]
20. Show that a field of order $p^2$ exists for every prime p. [Hint: Exercise 16 §4.2.]
21. (a) If $a^2 - 4b$ is not a square for a and b in a field F, show that $x^2 + ax + b$ is
irreducible in F[x].
(b) Show that the converse of (a) holds if 2 ≠ 0 in F.
22. Let f and g be nonzero polynomials in F[x].
(a) Show that A = {uf + vg | u, v in F[x]} is an ideal of F[x].
(b) Explain how Theorem 1 is related to Theorem 10 §4.2.
23. Polynomials $f_1, f_2, \ldots, f_m$ in F[x] are called relatively prime if 1 is the only monic
divisor of all of them in F[x]. Show that $f_1, f_2, \ldots, f_m$ are relatively prime if and
only if $1 = q_1 f_1 + q_2 f_2 + \cdots + q_m f_m$ for some $q_i$ in F[x]. [Hint: Theorem 1.]
24. Let f and g be two nonzero polynomials in F[x]. By Theorem 12 §4.2, monic irreducible
polynomials $p_1, p_2, \ldots, p_r$ exist such that

$f = a\,p_1^{f_1} p_2^{f_2} \cdots p_r^{f_r}, \quad 0 \le f_i \in \mathbb{Z},\ a \in F$

$g = b\,p_1^{g_1} p_2^{g_2} \cdots p_r^{g_r}, \quad 0 \le g_i \in \mathbb{Z},\ b \in F$

Here we take $f_i = 0$ if $p_i$ does not occur in the factorization of f (and write $p_i^0 = 1$),
with a similar convention for g. Define a polynomial

$m = p_1^{\max(f_1, g_1)} p_2^{\max(f_2, g_2)} \cdots p_r^{\max(f_r, g_r)}.$

Show that the following hold.

(a) m is monic.

(b) m is a common multiple of f and g.

(c) If q is any common multiple of f and g, then m divides q.

(d) m is uniquely determined by (a), (b), and (c).

Then m is called the least common multiple of f and g, denoted m = lcm(f, g).
25. Given f and g in F[x], let d = gcd(f, g) and m = lcm(f, g) (Exercise 24). Show that

(a) (f) + (g) = (d) (b) (f) ∩ (g) = (m)

26. (a) Let A ≠ F[x] be an ideal of F[x], where F is a field. If A ≠ 0, show that A is
prime if and only if it is maximal.

(b) What happens if A = 0? Defend your answer.

27. Let F be a field and let h be a monic polynomial in F[x]. Show that F[x]/(h) has
no nonzero nilpotent elements if and only if $h = p_1 p_2 \cdots p_r$, where the $p_i$ are distinct
monic irreducible polynomials. [Hint: Theorem 12 §4.2.]

28. Let F be a field and let h ≠ 1 be a monic polynomial in F[x]. Show that every
element of F[x]/(h) is either a unit or a nilpotent if and only if $h = p^n$, where n ≥ 1
and p is monic and irreducible.


29. Let F be a field and let h = pq in F[x], all polynomials being monic. If p and q
are relatively prime in F[x], show that $\frac{F[x]}{(h)} \cong \frac{F[x]}{(p)} \times \frac{F[x]}{(q)}$. [Hint: Exercise 25 and
Theorem 8 §3.4.]


30. Prove that Theorem 2 is valid as stated for any commutative ring R in place of 
the field F. Identify the places where the proofs of Lemmas 1, 2, 3, and 4 require 
modifications and make the required changes.

31. Let h be a monic polynomial of degree m in F[x], F a field. Let A = (h), and write
R = F[x]/A. Note that R can be written as R = {f(t) | f ∈ F[x]}, where t = x + A
as in Lemma 1.

(a) If I is an ideal of R, show that there is a uniquely determined, monic divisor d
of h in F[x] such that
$I = \{q(t)d(t) \mid q(t) \in R\} = \{f(t) \mid d \text{ divides } f \text{ in } F[x]\}.$
Thus, I = (d(t)) is a principal ideal of R.
(b) If $I_1 = (d_1(t))$, where $d_1$ is a monic divisor of h, show that $I \subseteq I_1$ if and only if
$d_1$ divides d in F[x].
(c) If h = db, where b is (necessarily) monic, show that
$I = \{f(t) \mid f(t)b(t) = 0\}.$
This asserts that every ideal of R is an annihilator.
(d) If deg d = m and deg h = n, show that every element f(t) of I is uniquely
represented in the form:
$f(t) = a_0 d(t) + a_1 t\,d(t) + \cdots + a_{n-m-1} t^{n-m-1} d(t), \quad a_i \in F.$


4.4 PARTIAL FRACTIONS 


In calculus the first step in integrating a quotient f(x)/g(x) of polynomial functions
is to express it in a simpler form by expanding as a sum of partial fractions. 
Students learn to find such expressions in specific cases but the reason they exist in 
general usually remains a mystery. This is clarified in this section. We begin with 
Example 1, showing how to use the theorem. 


Example 1. Expand $\dfrac{2x^2 - x + 1}{(x - 1)^2(x^2 + 1)}$ as a sum of partial fractions.


Solution. The theorem that we are going to prove asserts that real numbers (called
constants) a, b, c, and d exist such that

$\dfrac{2x^2 - x + 1}{(x - 1)^2(x^2 + 1)} = \dfrac{a}{x - 1} + \dfrac{b}{(x - 1)^2} + \dfrac{cx + d}{x^2 + 1}.$

Once we know that they exist, we can routinely determine the constants. We
multiply through by $(x - 1)^2(x^2 + 1)$ to clear denominators:
$2x^2 - x + 1 = a(x - 1)(x^2 + 1) + b(x^2 + 1) + (cx + d)(x - 1)^2.$ (*)

We find the constant b quickly by evaluating at 1. The result is 2 = 2b; b = 1. If we
evaluate at 0, 2, and -1, we get

$1 = -a + b + d,$

$7 = 5a + 5b + 2c + d,$

$4 = -4a + 2b - 4c + 4d.$

The result is $a = d = \tfrac{1}{2}$, $b = 1$, $c = -\tfrac{1}{2}$. Note that we may also obtain equations in
the constants by comparing coefficients of like powers of x on both sides of (*). For
example, the coefficients of $x^3$ are 0 = a + c. □
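
The constants in Example 1 can be checked mechanically. The sketch below assumes the values a = d = 1/2, b = 1, c = -1/2 obtained by solving the three equations above together with b = 1, and compares both sides of (*) at seven points. Since both sides have degree at most 3, agreement at more than three points forces them to be the same polynomial. The function names `lhs` and `rhs` are chosen here for illustration.

```python
from fractions import Fraction

# Candidate constants for Example 1 (exact rational arithmetic).
a, b, c, d = Fraction(1, 2), Fraction(1), Fraction(-1, 2), Fraction(1, 2)

def lhs(x):
    """Left-hand side of (*): 2x^2 - x + 1."""
    return 2 * x**2 - x + 1

def rhs(x):
    """Right-hand side of (*): a(x-1)(x^2+1) + b(x^2+1) + (cx+d)(x-1)^2."""
    return a * (x - 1) * (x**2 + 1) + b * (x**2 + 1) + (c * x + d) * (x - 1) ** 2

# Both sides have degree at most 3, so agreement at 7 points means equality.
checks = [lhs(Fraction(x)) == rhs(Fraction(x)) for x in range(-3, 4)]
print(all(checks))
```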




We need Theorem 1 below in the proof of the main theorem, but it has inde- 
pendent interest (and is valid over an arbitrary ring). 


Theorem 1. Let p be a monic polynomial in R[x], R any ring. Given f ∈ R[x],
there exist uniquely determined polynomials $r_0, r_1, \ldots, r_m$ in R[x] such that

$f = r_0 + r_1 p + r_2 p^2 + \cdots + r_m p^m$

and, for each i, either $r_i = 0$ or $\deg r_i < \deg p$.


Proof. If f = 0 or deg f < deg p, then $f = r_0$ does it. Otherwise use induction on
deg f. By the division algorithm (Theorem 4 §4.1) write $f = qp + r_0$, where $r_0 = 0$
or $\deg r_0 < \deg p$. Then q ≠ 0 so, as p is monic, deg f = deg q + deg p > deg q.
Hence, by induction, $q = r_1 + r_2 p + \cdots + r_m p^{m-1}$ for some m, where $r_i = 0$ or
$\deg r_i < \deg p$ for each i. The required representation of f follows. We leave the
proof that it is unique as Exercise 1. ∎
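
The induction in this proof is effectively an algorithm: divide by p repeatedly and collect the remainders. The following Python sketch does this for polynomials with integer coefficients, stored as coefficient lists with the constant term first. The representation and the names `divmod_monic` and `expand_in_powers` are assumptions of the sketch, and p is assumed to be monic of degree at least 1.

```python
def divmod_monic(f, p):
    """Divide f by a monic p (coefficient lists, lowest degree first).

    Returns (q, r) with f = q*p + r and r shorter than p.
    """
    f = list(f)
    q = [0] * max(len(f) - len(p) + 1, 0)
    for shift in range(len(f) - len(p), -1, -1):
        coeff = f[shift + len(p) - 1]      # current leading coefficient
        q[shift] = coeff
        for i, pc in enumerate(p):         # subtract coeff * x^shift * p
            f[shift + i] -= coeff * pc
    return q, f[: len(p) - 1]

def expand_in_powers(f, p):
    """Return [r_0, r_1, ...] with f = r_0 + r_1*p + r_2*p^2 + ..."""
    digits = []
    while any(f):
        f, r = divmod_monic(f, p)
        digits.append(r)
    return digits

# x^3 in powers of x - 1:  x^3 = 1 + 3(x-1) + 3(x-1)^2 + (x-1)^3
print(expand_in_powers([0, 0, 0, 1], [-1, 1]))
```

Because p is monic, no division of coefficients is ever needed, which is why the statement holds over an arbitrary ring.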


Example 2. Given a ∈ R, Theorem 1 asserts that each polynomial f ∈ R[x] has
an expansion of the form $f = a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots + a_m(x - a)^m$, where
$a_i \in R$ for each i.


Note that, if R is commutative and 2, 3, ... are units in R, we can show that the
coefficients $a_i$ in Example 2 are given by $a_i = \frac{1}{i!} f^{(i)}(a)$, where $f^{(i)}$ is the $i$th formal
derivative of the function f (see Section 6.4). This result is called Taylor’s theorem.

If F is a field, the field of quotients Q of the integral domain F[x] consists of
quotients $\frac{f}{g}$ of polynomials f, g ∈ F[x], g ≠ 0, called rational forms over F (see
Theorem 5 §3.2). These forms are added and multiplied analogously to rational
fractions. Working in Q, Theorem 1 enables us to prove the main result of this
section.


Theorem 2. Partial Fraction Expansion. Let F be a field, and let f, g ∈ F[x],
where g ≠ 0. Then the rational form $\frac{f}{g}$ has a unique expansion as a polynomial
plus the sum of a number of rational forms $\frac{r}{p^k}$, where the following hold:

(1) p is irreducible in F[x].

(2) $p^k$ is a divisor of g and k ≥ 1.

(3) Either r = 0 or deg r < deg p.


Proof. We work in the field of quotients Q of F[x]. Given a rational form $\frac{f}{g} \in Q$,
the division algorithm shows that $f = qg + f_1$, where $f_1 = 0$ or $\deg f_1 < \deg g$.
Hence $\frac{f}{g} = q + \frac{f_1}{g}$ so, passing to $f_1$, we may assume that deg f < deg g. Write
$g = p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m}$, where each $p_i$ is irreducible in F[x] and each $k_i \ge 1$. We need
the following observation.


Claim. $\dfrac{f}{g} = \dfrac{h_1}{p_1^{k_1}} + \dfrac{h_2}{p_2^{k_2}} + \cdots + \dfrac{h_m}{p_m^{k_m}}$ for polynomials $h_i$ in F[x].


Proof. Use induction on m, the result being clear if m = 1. If m > 1 write
$g_1 = p_2^{k_2} \cdots p_m^{k_m}$. Then $g_1$ and $p_1^{k_1}$ are relatively prime so (Theorem 10 §4.2) write
$1 = u g_1 + s p_1^{k_1}$, where u and s are in F[x]. Then

$\dfrac{f}{g} = \dfrac{f(u g_1 + s p_1^{k_1})}{p_1^{k_1} g_1} = \dfrac{fu}{p_1^{k_1}} + \dfrac{fs}{g_1}.$

Now induction applies to $\frac{fs}{g_1}$, which proves the Claim.
Given the Claim, it remains to expand $\frac{h}{p^k}$ in the required form, where h and p are
in F[x], p is irreducible, and k ≥ 1. To this end, use Theorem 1 to write

$h = r_0 + r_1 p + r_2 p^2 + \cdots + r_m p^m$

where, for each i, either $r_i = 0$ or $\deg r_i < \deg p$. Then $\frac{h}{p^k}$ has the desired form
(possibly with a polynomial summand), proving the existence of the expansion. We
omit the proof of uniqueness. ∎


If it happens that deg f < deg g in Theorem 2, the resulting expansion is the
same except that there is no polynomial term. If $p^k$ occurs in the factorization of
the denominator g into irreducible factors, it yields terms

$\dfrac{r_1}{p} + \dfrac{r_2}{p^2} + \cdots + \dfrac{r_k}{p^k}$

in the partial fraction expansion. Because $r_i = 0$ or $\deg r_i < \deg p$ for each i, we
have determined the form of this expansion, and only the coefficients of the $r_i$
remain to be calculated. For example, if deg f < 7, we have

$\dfrac{f}{(x^2 + x + 1)^2 (x + 1)^3} = \dfrac{ax + b}{x^2 + x + 1} + \dfrac{cx + d}{(x^2 + x + 1)^2} + \dfrac{r}{x + 1} + \dfrac{s}{(x + 1)^2} + \dfrac{t}{(x + 1)^3}$

for an appropriate choice of the constants a, b, c, d, r, s, and t. We give one more
example.
Example 3. Expand $\dfrac{x}{(x^2 + x + 1)^2 (x + 1)}$ in partial fractions over ℝ.


Solution. Because $x^2 + x + 1$ and $x + 1$ are irreducible in ℝ[x], the form of the
expansion is

$\dfrac{x}{(x^2 + x + 1)^2 (x + 1)} = \dfrac{ax + b}{x^2 + x + 1} + \dfrac{cx + d}{(x^2 + x + 1)^2} + \dfrac{e}{x + 1}.$

Clearing denominators gives

$x = (ax + b)(x^2 + x + 1)(x + 1) + (cx + d)(x + 1) + e(x^2 + x + 1)^2.$

Now:
Evaluating at -1 gives e = -1.
Comparing coefficients of $x^4$ gives 0 = a + e.
Comparing coefficients of x gives 1 = a + 2b + c + d + 2e.
Evaluating at 0 yields 0 = b + d + e.
Evaluating at 1 yields 1 = 6a + 6b + 2c + 2d + 9e.
The solution is a = c = d = 1, b = 0, e = -1. □


The only irreducible polynomials in ℝ[x] are linear and quadratic (Corollary to
Theorem 4 §4.2), so Theorem 2 shows that every rational form over ℝ is a sum of
terms of the types

polynomials, $\dfrac{ax + b}{(x^2 + rx + s)^k}$, and $\dfrac{a}{(x + r)^k}$.

It turns out that all these forms (and hence every rational form) can be integrated
by using only elementary functions.




Exercises 4.4 


1. Prove the uniqueness in Theorem 1. 
2. In each case, express the rational form as a sum of partial fractions over R. 
x? —x2t+) 1 a2t+l1 etet+) 
Paces caste 5) paris cee Beda a 
© rete) ™ @rppety Osea © Grip 
3. Expand Gee as a sum of partial fractions over F, where the u; are 


distinct elements of the field F. 

La kk 
4. Using partial fractions, deduce that fe ny 
2(a-+1)--(2-+n) & (i) (a+k) 


4.5 SYMMETRIC POLYNOMIALS⁶¹


For any ring R we can iterate the process of forming a polynomial ring to construct
the ring R[x, y] = (R[x])[y]. Every member of R[x, y] has the form

$f(x, y) = p_0 + p_1 y + p_2 y^2 + \cdots = \sum_{j \ge 0} p_j y^j,$

where each $p_j$ is in R[x] and the sum is finite. If we write $p_j = \sum_{i \ge 0} a_{ij} x^i$, where
the $a_{ij}$ are in R, then f(x, y) becomes a finite double sum:

$f(x, y) = \sum_{i \ge 0} \sum_{j \ge 0} a_{ij} x^i y^j = a_{00} + a_{10} x + a_{01} y + a_{20} x^2 + a_{11} xy + a_{02} y^2 + \cdots$ (*)

Moreover, each of x and y commutes with the other and with every element of R.
Thus, we may interchange the roles of x and y, that is

(R[x])[y] = (R[y])[x].

This is called the ring of polynomials in the two indeterminates x and y. The reader
should verify that the representation in (*) is unique in the sense that

$\sum a_{ij} x^i y^j = \sum b_{ij} x^i y^j$ if and only if $a_{ij} = b_{ij}$ for all i and j.

If R = ℝ, the elements of ℝ[x, y] are the familiar polynomial expressions from
calculus and geometry.

If R is any ring, define the ring $R[x_1, \ldots, x_n]$ of polynomials in the indeterminates
$x_1, \ldots, x_n$ recursively as follows:

$R[x_1, \ldots, x_n] = (R[x_1, \ldots, x_{n-1}])[x_n]$ for n ≥ 2.

Hence, $R[x_1, x_2] = (R[x_1])[x_2]$ as above, $R[x_1, x_2, x_3] = (R[x_1, x_2])[x_3]$, and so on.
With induction, Theorems 1 and 2 §4.1 give immediately


Theorem 1. If R is any ring, then $R[x_1, \ldots, x_n]$ is a ring. Moreover, if R is
commutative or a domain, so also is $R[x_1, \ldots, x_n]$.

By induction, the indeterminates $x_1, \ldots, x_n$ commute with each other and with
all elements of R. Hence, the order in which the $x_i$ are adjoined to R is irrelevant;
that is, for all permutations σ in the symmetric group $S_n$, we have

$R[x_{\sigma 1}, \ldots, x_{\sigma n}] = R[x_1, \ldots, x_n].$

⁶¹ The results in this section are needed in Sections 6.6 and 10.3, and nowhere else.




Moreover, induction shows that each polynomial in $R[x_1, \ldots, x_n]$ is a finite sum

$f(x_1, \ldots, x_n) = \sum a_{i_1 \cdots i_n} x_1^{i_1} \cdots x_n^{i_n}, \quad a_{i_1 \cdots i_n} \in R,$

where the sum is taken over all n-tuples $(i_1, \ldots, i_n)$ with each $i_k \ge 0$, and where
only finitely many coefficients $a_{i_1 \cdots i_n}$ are nonzero. This representation is unique in
the sense that

$\sum a_{i_1 \cdots i_n} x_1^{i_1} \cdots x_n^{i_n} = \sum b_{i_1 \cdots i_n} x_1^{i_1} \cdots x_n^{i_n}$

if and only if $a_{i_1 \cdots i_n} = b_{i_1 \cdots i_n}$ for all $i_1, \ldots, i_n$. Again, this is easily established by
induction.

In discussing a polynomial in $R[x_1, \ldots, x_n]$, having names for the individual
terms that make up the polynomial is useful. If $i_1 \ge 0, \ldots, i_n \ge 0$ are integers, a
polynomial of the form

$m = x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$

is called a monomial in $R[x_1, \ldots, x_n]$ and am, a ∈ R, is called a monomial term.
If a ≠ 0, the degree of am is defined to be $\deg(am) = i_1 + \cdots + i_n$. The degree of any
nonzero polynomial in $R[x_1, \ldots, x_n]$ is defined to be the maximum of the degrees
of its nonzero monomial terms. A nonzero polynomial in $R[x_1, \ldots, x_n]$ is called
homogeneous if each of its monomial terms has the same degree. This notion of
degree coincides with our earlier definition in R[x]. If f ≠ 0 in $R[x_1, \ldots, x_n]$ and
$k_1, k_2, \ldots, k_m$ are the integers occurring as degrees of monomials in f, we can write
f as $f = h_1 + h_2 + \cdots + h_m$, where each $h_i$ is homogeneous of degree $k_i$ ($h_i$ is
the sum of all monomial terms in f of degree $k_i$). These terms $h_i$ are called the
homogeneous components of f.


Example 1. In R[x, y, z], $\deg(2x^3 y z^2) = 6$, $\deg(-xz) = 2$, and deg 1 = 0. The
polynomials $x + y$, $xy + y^2$, and $xyz + x^2 y + xy^2 + z^3$ are homogeneous. However,
$x^2 + 2yz + xz^2$ is not homogeneous but has two homogeneous components: $x^2 + 2yz$
and $xz^2$.
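
Degrees and homogeneous components are easy to compute once a polynomial is stored by its exponents. The sketch below assumes one possible representation, chosen here for illustration: a dictionary mapping exponent tuples $(i_1, \ldots, i_n)$ to coefficients. It is applied to the polynomial $x^2 + 2yz + xz^2$ from Example 1, with the variables ordered x, y, z.

```python
from collections import defaultdict

def degree(poly):
    """Maximum total degree among the nonzero monomial terms."""
    return max(sum(exps) for exps, coeff in poly.items() if coeff != 0)

def homogeneous_components(poly):
    """Group the monomial terms of poly by total degree."""
    parts = defaultdict(dict)
    for exps, coeff in poly.items():
        if coeff != 0:
            parts[sum(exps)][exps] = coeff
    return dict(parts)

# x^2 + 2yz + x z^2, with exponent tuples in the order (x, y, z)
f = {(2, 0, 0): 1, (0, 1, 1): 2, (1, 0, 2): 1}
components = homogeneous_components(f)
print(degree(f), components)
```

The result has one component of degree 2 (namely $x^2 + 2yz$) and one of degree 3 (namely $xz^2$), matching the example.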


Given $f = f(x_1, \ldots, x_n)$ in $R[x_1, \ldots, x_n]$ and $a_1, \ldots, a_n$ in the center Z(R) of
R, we evaluate $f(a_1, \ldots, a_n)$ as in Section 4.1. In fact the evaluation map

$R[x_1, \ldots, x_n] \to R$ given by $f(x_1, \ldots, x_n) \mapsto f(a_1, \ldots, a_n)$

is a ring homomorphism. This result follows by induction on n: If n = 1, it is
Theorem 5 §4.1; if n > 1, the map is a composite of the mappings

$R[x_1, \ldots, x_n] = (R[x_1, \ldots, x_{n-1}])[x_n] \to R[x_1, \ldots, x_{n-1}] \to R,$

where the first map is evaluation at $a_n$ and the second map is evaluation at
$(a_1, \ldots, a_{n-1})$. Both are ring homomorphisms: the first by Theorem 5 §4.1, because
$a_n$ is central in $R[x_1, \ldots, x_{n-1}]$, and the second by induction. Hence, the
composite is a ring homomorphism.

If n ≥ 2, a polynomial in $R[x_1, \ldots, x_n]$ can have more than one monomial term
of maximal degree. This means that the notion of degree is not as useful here as
it is for polynomials in one indeterminate. What we need is a way of ordering the
monomials themselves. If

$p = x_1^{p_1} x_2^{p_2} \cdots x_n^{p_n}$ and $q = x_1^{q_1} x_2^{q_2} \cdots x_n^{q_n}$

are monomials, we write p ≤ q if and only if p = q or $p_k < q_k$, where k is the
smallest integer t with $p_t \neq q_t$. We write p < q if p ≤ q but p ≠ q, and in this case
we say that q is higher than p. This is a total ordering of the monomials (Exercise
19) called the lexicographic (or dictionary) order. Thus, the ordering of two
monomials (written as $x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$) is determined by the exponents in the first
place from the left in which they differ, as for words in a dictionary.
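
If a monomial is recorded by its exponent tuple $(i_1, \ldots, i_n)$, the lexicographic order is exactly the order Python already uses to compare tuples. The sketch below makes the definition explicit in a helper called `is_higher` (a name chosen here) and sorts four sample monomials that are made up for illustration.

```python
def is_higher(q, p):
    """True if monomial q is higher than p in the lexicographic order.

    Monomials are exponent tuples; the first position where they differ decides.
    """
    for pt, qt in zip(p, q):
        if pt != qt:
            return pt < qt
    return False  # p == q

# Sample monomials in three indeterminates (illustrative values only):
# x1*x2^2, x1^2, x1*x2*x3^3, x2^5*x3
monomials = [(1, 2, 0), (2, 0, 0), (1, 1, 3), (0, 5, 1)]
ordered = sorted(monomials)  # tuple comparison is lexicographic
print(ordered)
```

Note that the order ignores total degree: $x_2^5 x_3$ has degree 6 but is the lowest of the four, because its exponent of $x_1$ is the smallest.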


Example 2. Order the set {x?x$x?, x2a3r3x3, x?x3x4, 22223} of monomials in 


R[x1, £2, £3, La]. 


Solution. x29 3 2 oh a3 Oo 


aSatal < a2alata? < cfadakad < stedahe. 

If f ≠ 0 in $R[x_1, \ldots, x_n]$, let p be the highest monomial appearing in f. If p has
(nonzero) coefficient a ∈ R, then ap is called the highest term in f and is denoted
ht(f), and a is called the highest coefficient of f.


Ezample 3. If f(a, 22) = 42122 — 3a123 + 323, then ht(f) = —32122. 


The next result is a generalization of Theorem 3 §4.1 and is needed in the proof
of the Fundamental Theorem (Theorem 4 below). An element a ∈ R is called a
non-zero-divisor if ar = 0 and sa = 0 can only happen in R if r = 0 and s = 0.


Theorem 2. Let f and g be nonzero polynomials in $R[x_1, \ldots, x_n]$. If the highest
coefficient of one of them is a non-zero-divisor, then ht(fg) = ht(f) · ht(g).


Proof. Let ht(f) = rp and ht(g) = sq, where r ≠ 0 ≠ s in R, and write
$p = x_1^{p_1} \cdots x_n^{p_n}$ and $q = x_1^{q_1} \cdots x_n^{q_n}$. We must show that ht(fg) = rspq. With rs ≠ 0
by hypothesis, it suffices to show that, if a and b are monomials in f and g, with
either a < p or b < q, then ab < pq. Write $a = x_1^{a_1} \cdots x_n^{a_n}$ and $b = x_1^{b_1} \cdots x_n^{b_n}$ and
assume that a < p, say $a_k < p_k$, where k is minimal such that $a_k \neq p_k$. Because
b ≤ q, there are two cases.


• Case 1. b = q. Then $b_i = q_i$ for all i, so $a_k + b_k < p_k + q_k$, where k is minimal,
showing that ab < pq.


• Case 2. b < q. Now let $b_l < q_l$, where l is minimal. If m is the smaller of l
and k, the reader should verify that $a_m + b_m < p_m + q_m$, where m is minimal.
Hence ab < pq in this case, too. ∎


Symmetric Polynomials 


A polynomial $f(x_1, x_2, \ldots, x_n)$ in $R[x_1, x_2, \ldots, x_n]$ is called a symmetric polynomial
if it is unchanged by any permutation of the indeterminates $x_i$:
$f(x_{\sigma 1}, x_{\sigma 2}, \ldots, x_{\sigma n}) = f(x_1, x_2, \ldots, x_n)$ for all σ in $S_n$.


Example 4. Every constant polynomial is symmetric.
Example 5. $\sum_{i \neq j} x_i x_j^2$ is symmetric. If n = 3, this is

$x_1 x_2^2 + x_1 x_3^2 + x_2 x_3^2 + x_2 x_1^2 + x_3 x_1^2 + x_3 x_2^2.$


Example 6. $p_k(x_1, \ldots, x_n) = x_1^k + x_2^k + \cdots + x_n^k$ is symmetric for k ≥ 0, and is
called the k-power symmetric polynomial. Note that $p_0(x_1, \ldots, x_n) = n$.
Example 7. $d(x_1, x_2, \ldots, x_n) = \prod_{i<j} (x_i - x_j)^2$ is a symmetric polynomial, called
the discriminant of the $x_i$. For example, if n = 3, this is

$d(x_1, x_2, x_3) = (x_1 - x_2)^2 (x_1 - x_3)^2 (x_2 - x_3)^2.$




Let $R[t, x_1, x_2, \ldots, x_n]$ be a polynomial ring in n + 1 indeterminates and consider
the expressions:
$(t - x_1)(t - x_2) = t^2 - (x_1 + x_2)t + x_1 x_2,$
$(t - x_1)(t - x_2)(t - x_3) = t^3 - (x_1 + x_2 + x_3)t^2 + (x_1 x_2 + x_1 x_3 + x_2 x_3)t - x_1 x_2 x_3.$
If we regard these expressions as polynomials in t, the coefficients of powers of t
are symmetric polynomials in the $x_i$, because permuting the $x_i$ does not affect the
left-hand side of the equations. The general definition is as follows.


The elementary symmetric polynomials $s_0, s_1, s_2, \ldots, s_n$ in $R[x_1, \ldots, x_n]$
are defined as follows:
$s_k(x_1, x_2, \ldots, x_n) = \sum_{i_1 < i_2 < \cdots < i_k} x_{i_1} x_{i_2} \cdots x_{i_k}$ for any k = 1, 2, ..., n,
$s_0(x_1, x_2, \ldots, x_n) = 1.$
Thus $s_k(x_1, x_2, \ldots, x_n)$ is the sum of all distinct products of k of the indeterminates.


For example,
$s_1(x_1, x_2, \ldots, x_n) = x_1 + x_2 + \cdots + x_n,$
$s_n(x_1, x_2, \ldots, x_n) = x_1 x_2 \cdots x_n.$
If n = 4, we have
$s_2(x_1, x_2, x_3, x_4) = x_1 x_2 + x_1 x_3 + x_1 x_4 + x_2 x_3 + x_2 x_4 + x_3 x_4,$
$s_3(x_1, x_2, x_3, x_4) = x_1 x_2 x_3 + x_1 x_2 x_4 + x_1 x_3 x_4 + x_2 x_3 x_4.$
Note that $s_k$ is homogeneous of degree k for each k = 1, 2, ..., n.


One of the main reasons for the importance of the elementary symmetric polynomials
is the way they are related to the roots of a polynomial. For example,

$(t - x_1)(t - x_2) = t^2 - (x_1 + x_2)t + x_1 x_2 = t^2 - s_1(x_1, x_2)t + s_2(x_1, x_2).$

Since $s_0(x_1, x_2, \ldots, x_n) = 1$, this expression generalizes as follows.
Theorem 3. Write $s_k = s_k(x_1, \ldots, x_n)$ for 1 ≤ k ≤ n. Then
$(t - x_1)(t - x_2) \cdots (t - x_n) = t^n - s_1 t^{n-1} + s_2 t^{n-2} - \cdots \pm s_n = \sum_{k=0}^{n} (-1)^k s_k t^{n-k}.$


Proof. The coefficient of $t^n$ is $1 = s_0$. The expansion of the left-hand side is the sum
of all products of n terms, one from each of the factors $t - x_i$. If k ≥ 1, each product
involving $t^{n-k}$ has the form $t^{n-k}(-x_{i_1})(-x_{i_2}) \cdots (-x_{i_k})$, where $i_1 < i_2 < \cdots < i_k$.
The sum of these terms is clearly $t^{n-k}(-1)^k s_k$. ∎
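
Theorem 3 can be spot-checked numerically. The sketch below evaluates the $s_k$ at sample integer values of the $x_i$ (the values 2, 3, 5, 7 are arbitrary choices for illustration), multiplies out the product of the factors $t - x_i$, and compares coefficients. The function names are assumptions of this sketch.

```python
from itertools import combinations
from math import prod

def elementary(k, xs):
    """s_k evaluated at xs: the sum of all products of k distinct entries."""
    return sum(prod(c) for c in combinations(xs, k))

def expand_product(xs):
    """Coefficients of (t - x_1)...(t - x_n), highest power of t first."""
    coeffs = [1]
    for x in xs:
        # Multiply the current polynomial by (t - x).
        coeffs = [a - x * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

xs = [2, 3, 5, 7]
expected = [(-1) ** k * elementary(k, xs) for k in range(len(xs) + 1)]
print(expand_product(xs), expected)
```

The two lists agree: the coefficient of $t^{n-k}$ in the product is $(-1)^k s_k$. A numeric check at one point is of course only an illustration, not a proof.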


It follows from the definition that the set S of all symmetric polynomials in
$R[x_1, \ldots, x_n]$ is a subring containing the constant polynomials. Hence every polynomial
$f(s_1, \ldots, s_n)$ in the elementary symmetric polynomials (with coefficients in
R) is again in S. The fundamental theorem shows that every symmetric polynomial
has this form.


Theorem 4. Fundamental Theorem of Symmetric Polynomials. Let R be
any ring and let S denote the subring of all symmetric polynomials in $R[x_1, \ldots, x_n]$.
Then every member of S may be written in precisely one way as a polynomial
$f(s_1, s_2, \ldots, s_n)$ in the elementary symmetric polynomials $s_k = s_k(x_1, \ldots, x_n)$,
where $f(x_1, x_2, \ldots, x_n)$ is in $R[x_1, \ldots, x_n]$. Thus the map

$f(x_1, x_2, \ldots, x_n) \mapsto f(s_1, s_2, \ldots, s_n)$

is a ring isomorphism from $R[x_1, x_2, \ldots, x_n]$ onto S.




Proof. Let $g = g(x_1, \ldots, x_n) \neq 0$ be symmetric. If $k_1, \ldots, k_m$ are the (distinct)
integers that occur as degrees of monomials in g, then $g = g_1 + \cdots + g_m$, where $g_i$ is
homogeneous of degree $k_i$ for each i. Given $\sigma \in S_n$ and a monomial $h(x_1, \ldots, x_n)$,
the fact that $h(x_1, \ldots, x_n)$ and $h(x_{\sigma 1}, \ldots, x_{\sigma n})$ have the same degree shows that
each $g_i$ is itself symmetric. Hence, we may assume that g is homogeneous.

Let g be symmetric and homogeneous with highest term ht(g) = ap, where
a ≠ 0 in R and $p = x_1^{m_1} \cdots x_n^{m_n}$. Consider the transposition σ = (k k+1) in
$S_n$, where 1 ≤ k < n. Because g is symmetric, it contains the monomial term
aq, where $q = x_1^{m_1} \cdots x_k^{m_{k+1}} x_{k+1}^{m_k} \cdots x_n^{m_n}$. Hence p ≥ q by the choice of p, which
means that $m_k \ge m_{k+1}$ for each k and hence that $m_1 \ge m_2 \ge \cdots \ge m_n$. But, given
nonnegative integers $k_1, k_2, \ldots, k_n$, Theorem 2 implies that

$\mathrm{ht}[s_1^{k_1} s_2^{k_2} \cdots s_n^{k_n}] = x_1^{k_1} (x_1 x_2)^{k_2} (x_1 x_2 x_3)^{k_3} \cdots (x_1 x_2 \cdots x_n)^{k_n} = x_1^{k_1 + k_2 + \cdots + k_n} x_2^{k_2 + \cdots + k_n} \cdots x_n^{k_n}.$

Hence, the polynomial $g_1 = a s_1^{m_1 - m_2} s_2^{m_2 - m_3} \cdots s_{n-1}^{m_{n-1} - m_n} s_n^{m_n}$ has the same highest
term as g, and so $g - g_1$ either is 0 or has a lower highest term than g. Since it
clearly suffices to show that $g - g_1$ is a polynomial in the $s_i$, we can repeat the
process. A finite number of such repetitions yield g as a polynomial in the $s_i$.


Next, we prove the uniqueness of the representation. If in $R[x_1, x_2, \ldots, x_n]$
some polynomial can be expressed in two ways as a polynomial in $s_1, s_2, \ldots, s_n$,
subtracting gives an equation

$\sum_{k_1, k_2, \ldots, k_n} a_{k_1 k_2 \cdots k_n} s_1^{k_1} s_2^{k_2} \cdots s_n^{k_n} = 0,$ (**)

where all coefficients are nonzero. Now the polynomial $s_1^{k_1} s_2^{k_2} \cdots s_n^{k_n}$ has highest
monomial $x_1^{k_1 + k_2 + \cdots + k_n} x_2^{k_2 + \cdots + k_n} \cdots x_n^{k_n}$, which uniquely determines the integers
$k_1, \ldots, k_n$. Consequently, distinct monomials $s_1^{k_1} s_2^{k_2} \cdots s_n^{k_n}$ in the $s_i$ have distinct
highest monomials in the $x_i$. Choose the highest $x_i$ monomial arising in this way
from the terms in (**). Then it occurs only once in (**) and with a nonzero coefficient.
This contradicts the uniqueness of the representation of 0 in $R[x_1, \ldots, x_n]$
as a linear combination of $x_i$ monomials.


Finally, the mapping

$R[x_1, \ldots, x_n] \to R[x_1, \ldots, x_n]$ given by $f(x_1, \ldots, x_n) \mapsto f(s_1, \ldots, s_n)$

has image S by the first part of this proof and is one-to-one by the uniqueness.
Since the mapping is evaluation at $s_1, \ldots, s_n$, it is a ring homomorphism because
each $s_i$ commutes with every element of R (all coefficients of $s_i$ are 1). Hence, S is
a subring isomorphic to $R[x_1, \ldots, x_n]$. ∎


The proof of Theorem 4 provides a method to actually express a symmetric
homogeneous polynomial f as a polynomial in the $s_i$. If $a x_1^{m_1} \cdots x_n^{m_n}$ is the highest
monomial term in f, subtract the term $a s_1^{m_1 - m_2} s_2^{m_2 - m_3} \cdots s_{n-1}^{m_{n-1} - m_n} s_n^{m_n}$ from f,
and repeat the procedure if the result is not a polynomial in the $s_i$. Example 8
demonstrates the method.


Example 8. Express $f(x_1, x_2) = x_1 x_2^3 + x_1^3 x_2$ in terms of elementary symmetric
polynomials.


Solution. Here n = 2 and f is homogeneous with highest term $x_1^3 x_2$. Hence,
$f - s_1^2 s_2 = (x_1 x_2^3 + x_1^3 x_2) - (x_1 + x_2)^2 (x_1 x_2) = -2 x_1^2 x_2^2 = -2 s_2^2.$
Hence, we are done with one iteration in this case, and $f = s_1^2 s_2 - 2 s_2^2$. □
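
The reduction used in Example 8 can be automated. The following Python sketch follows the procedure from the proof of Theorem 4 under some assumptions of its own: polynomials have integer coefficients and are stored as dictionaries from exponent tuples to coefficients, and the names `poly_mul`, `poly_sub`, `elementary`, and `in_elementary` are illustrative choices. The output maps each exponent tuple $(k_1, \ldots, k_n)$ to the coefficient of $s_1^{k_1} \cdots s_n^{k_n}$.

```python
from itertools import combinations

def poly_mul(f, g):
    out = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def poly_sub(f, g):
    out = dict(f)
    for e, c in g.items():
        out[e] = out.get(e, 0) - c
    return {e: c for e, c in out.items() if c != 0}

def elementary(k, n):
    """s_k in n indeterminates, as {exponent tuple: coefficient}."""
    return {tuple(1 if i in idx else 0 for i in range(n)): 1
            for idx in combinations(range(n), k)}

def in_elementary(f, n):
    """Rewrite a symmetric f as {(k_1, ..., k_n): coefficient of s_1^k_1...s_n^k_n}."""
    result = {}
    while f:
        m = max(f)   # highest monomial: tuples compare lexicographically
        a = f[m]     # highest coefficient
        ks = tuple(m[i] - m[i + 1] for i in range(n - 1)) + (m[n - 1],)
        if min(ks) < 0:
            raise ValueError("input is not symmetric")
        term = {(0,) * n: a}
        for k, power in enumerate(ks, start=1):
            for _ in range(power):
                term = poly_mul(term, elementary(k, n))
        f = poly_sub(f, term)   # the highest term cancels
        result[ks] = a
    return result

# Example 8: f = x1*x2^3 + x1^3*x2
print(in_elementary({(1, 3): 1, (3, 1): 1}, 2))
```

For Example 8 the result has the two entries (2, 1) with coefficient 1 and (0, 2) with coefficient -2, that is, $f = s_1^2 s_2 - 2 s_2^2$.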


A method of undetermined coefficients is often easier to use than the technique
in Example 8. We let f be symmetric and homogeneous of degree m in
$R[x_1, x_2, \ldots, x_n]$. The proof of Theorem 4 shows that f is a linear combination
(with coefficients in R) of polynomials $s_1^{k_1} s_2^{k_2} \cdots s_n^{k_n}$ with degree m, that is, with
$k_1 + 2k_2 + \cdots + nk_n = m$. If f is as in Example 8, then m = 4 and n = 2, so the
$s_i$ monomials in f have the form $s_1^{k_1} s_2^{k_2}$, where $k_1 + 2k_2 = 4$. Hence, f itself has the
form:

$f = a s_1^4 + b s_1^2 s_2 + c s_2^2$, a, b, and c in R.


Substituting $(x_1, x_2) = (1, 0)$ gives a = 0; then $(x_1, x_2) = (1, -1)$ gives c = -2;
finally, $(x_1, x_2) = (1, 1)$ gives b = 1. Example 9 provides another illustration.


Example 9. Express $f(x_1, \ldots, x_n) = \sum_{i \neq j} x_i x_j^2$ in terms of elementary symmetric
polynomials.
Solution. Since f is homogeneous of degree 3, it has the form $f = a s_1^3 + b s_1 s_2 + c s_3$.
Taking $(x_1, \ldots, x_n) = (1, 0, \ldots, 0)$ yields a = 0; then $(x_1, \ldots, x_n) = (1, 1, 0, \ldots, 0)$
gives b = 1; and, finally, $(x_1, \ldots, x_n) = (1, 1, 1, 0, \ldots, 0)$ gives c = -3. Hence
$f = s_1 s_2 - 3 s_3$. □
Note that the solution to Example 9 is based on the tacit assumption that
n ≥ 3 when expanding f (so $s_3$ can be written down). If n = 2, then $f(x_1, x_2) =
x_1 x_2^2 + x_1^2 x_2 = (x_1 + x_2)(x_1 x_2) = s_1 s_2$, so the formula in Example 9 holds here
too if $s_3(x_1, x_2) = 0$. But any valid formula in $R[x_1, x_2, x_3]$ reduces to a formula in
$R[x_1, x_2]$ simply by taking $x_3 = 0$. Thus, $s_3(x_1, x_2, 0) = 0$, $s_2(x_1, x_2, 0) = s_2(x_1, x_2)$,
and $s_1(x_1, x_2, 0) = s_1(x_1, x_2)$. Hence, the formula in Example 9 is valid even if n = 2.
The k-power polynomials $p_k(x_1, \ldots, x_n) = x_1^k + x_2^k + \cdots + x_n^k$ are symmetric
and can be given in terms of $s_1, \ldots, s_n$ by formulas originating with Isaac Newton.
The first three are

$p_1 = s_1, \quad p_2 = s_1^2 - 2s_2, \quad \text{and} \quad p_3 = s_1^3 - 3s_1 s_2 + 3s_3.$

The first of these is clear and the others come from the following recursions.
Theorem 5. Newton’s Identities. Let $p_k = p_k(x_1, \ldots, x_n) = x_1^k + x_2^k + \cdots + x_n^k$
denote the k-power symmetric polynomials. Then, for each k ≥ 1,

$p_k = p_{k-1} s_1 - p_{k-2} s_2 + \cdots + (-1)^k p_1 s_{k-1} + (-1)^{k+1} k s_k.$

Note the coefficient k in the last term.
The proof is somewhat technical, and we present it at the end of this section.

Hence, given $p_1 = s_1$, the Newton identity with k = 2 gives

$p_2 = p_1 s_1 - 2s_2 = s_1^2 - 2s_2.$

Then the case k = 3 gives $p_3 = p_2 s_1 - p_1 s_2 + 3s_3$, which yields

$p_3 = s_1^3 - 3 s_1 s_2 + 3 s_3.$

Clearly, we can find $p_4, p_5, \ldots$ in the same way.
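
The recursion is easy to run numerically. The sketch below evaluates the $s_k$ at sample values (the choice $x = (1, 2, 3)$ is arbitrary), applies the identity of Theorem 5 for k = 1, ..., n, and compares with the power sums computed directly. The function names are assumptions of this sketch.

```python
from itertools import combinations
from math import prod

def elementary(k, xs):
    """s_k evaluated at xs."""
    return sum(prod(c) for c in combinations(xs, k))

def power_sums_via_newton(xs, kmax):
    """Return [p_0, p_1, ..., p_kmax], computed from the s_k by Newton's identities."""
    s = [elementary(k, xs) for k in range(kmax + 1)]
    p = [len(xs)]  # p_0 = n
    for k in range(1, kmax + 1):
        # p_k = p_{k-1} s_1 - p_{k-2} s_2 + ... + (-1)^k p_1 s_{k-1} + (-1)^{k+1} k s_k
        total = sum((-1) ** (j + 1) * p[k - j] * s[j] for j in range(1, k))
        total += (-1) ** (k + 1) * k * s[k]
        p.append(total)
    return p

xs = [1, 2, 3]
newton = power_sums_via_newton(xs, len(xs))
direct = [sum(x**k for x in xs) for k in range(len(xs) + 1)]
print(newton, direct)
```

Both lists come out as 3, 6, 14, 36, in agreement with $p_2 = s_1^2 - 2s_2 = 36 - 22$ and $p_3 = s_1^3 - 3s_1 s_2 + 3s_3 = 216 - 198 + 18$.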




If σ is a permutation in $S_n$, the sign sgn σ of σ is defined by

$\operatorname{sgn} \sigma = \begin{cases} 1, & \text{if } \sigma \text{ is even,} \\ -1, & \text{if } \sigma \text{ is odd.} \end{cases}$

Then we can easily show (Exercise 29 §1.4) that sgn στ = sgn σ · sgn τ; that is, sgn is
a group homomorphism $S_n \to \{1, -1\}$. The following class of polynomials is closely
related to the symmetric polynomials.

A polynomial $f(x_1, \ldots, x_n)$ in $R[x_1, \ldots, x_n]$ is said to be alternating if

$f(x_{\sigma 1}, x_{\sigma 2}, \ldots, x_{\sigma n}) = \operatorname{sgn} \sigma \cdot f(x_1, x_2, \ldots, x_n)$ for all $\sigma \in S_n$.

Examples include $(x_1 - x_2) x_1 x_2$ and $(x_1 - x_2)(x_1 - x_3)(x_2 - x_3)$. We characterize
these alternating polynomials where R is a domain with characteristic not equal
to 2.

As often happens, it is convenient to deal with a more general situation. Let
$f(x_1, \ldots, x_n)$ be a nonzero polynomial in $R[x_1, \ldots, x_n]$, and suppose that a mapping
$r: S_n \to R$ exists such that

$f(x_{\sigma 1}, x_{\sigma 2}, \ldots, x_{\sigma n}) = r(\sigma) f(x_1, x_2, \ldots, x_n)$ for all $\sigma \in S_n$.

Thus, f is symmetric if r(σ) = 1 for all σ, and f is alternating if r(σ) = sgn σ for
all σ. If R is a domain and $\varepsilon \in S_n$ is the identity permutation, it follows easily that

r(ε) = 1 and r(στ) = r(σ) · r(τ)

for all σ and τ in $S_n$. In particular, if γ is a transposition ($\gamma^2 = \varepsilon$), then r(γ) = ±1.
Since every permutation is a product of transpositions (Section 1.4), this shows that
r(σ) = ±1 for all σ, and hence that

$r: S_n \to \{1, -1\} \subseteq R$

is a group homomorphism.

Let $K = \ker r = \{\sigma \in S_n \mid r(\sigma) = 1\}$. As $r(\sigma^2) = r(\sigma)^2 = 1$, we have $\sigma^2 \in K$
for all $\sigma \in S_n$. Hence, if $\sigma^3 = \varepsilon$, then $\sigma^{-1} = \sigma^2 \in K$, so σ ∈ K. In particular,
K contains every 3-cycle and thus $A_n \subseteq K$. ($A_n$ is generated by the 3-cycles by
Lemma 2 §2.8.) But $A_n$ has index 2 in $S_n$, so $A_n \subseteq K$ means that either $K = S_n$
or $K = A_n$. If $K = S_n$, then r(σ) = 1 for all σ and f is symmetric. If $K = A_n$, then
r(σ) = sgn σ and f is alternating. This proves the first part of Theorem 6 below.
To state it we need some terminology.

The polynomial $\Delta_n = \Delta_n(x_1, \ldots, x_n) = \prod_{i<j} (x_i - x_j)$ is called the alternator
of the variables $x_i$. Thus,

$\Delta_2(x_1, x_2) = (x_1 - x_2)$
$\Delta_3(x_1, x_2, x_3) = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3).$
These alternators clearly have the property that
$\Delta_n(x_{\sigma 1}, x_{\sigma 2}, \ldots, x_{\sigma n}) = \pm\Delta_n(x_1, x_2, \ldots, x_n)$ for all σ in $S_n$.


If R is a domain, the preceding discussion shows that $\Delta_n$ is either alternating or
symmetric. But if σ = (1 2), then $\Delta_n(x_{\sigma 1}, \ldots, x_{\sigma n}) = -\Delta_n(x_1, \ldots, x_n)$ because σ
negates $(x_1 - x_2)$ and permutes the other factors of $\Delta_n$. Thus, $\Delta_n$ is not symmetric
and so must be alternating.
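
The alternating property of the alternator can be spot-checked by brute force over all of $S_4$. The sketch below evaluates the product of the differences $x_i - x_j$ ($i < j$) at sample integer values (2, 3, 5, 7, an arbitrary choice) and computes the sign of a permutation by counting inversions, a standard fact that is assumed here rather than taken from the text. The function names are illustrative.

```python
from itertools import combinations, permutations
from math import prod

def alternator(xs):
    """Product of (x_i - x_j) over all pairs i < j."""
    return prod(xs[i] - xs[j] for i, j in combinations(range(len(xs)), 2))

def sign(perm):
    """Sign of a permutation of 0..n-1, via the parity of its inversion count."""
    inversions = sum(1 for i, j in combinations(range(len(perm)), 2)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

xs = (2, 3, 5, 7)
base = alternator(xs)
is_alternating = all(
    alternator(tuple(xs[s] for s in perm)) == sign(perm) * base
    for perm in permutations(range(len(xs)))
)
print(base, is_alternating)
```

Each permutation either fixes the value of the alternator or negates it, according to its sign, as the discussion above predicts.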




Theorem 6. Let R be a domain, let n ≥ 2, and let $0 \neq f \in R[x_1, \ldots, x_n]$.
(1) If for each $\sigma \in S_n$, $f(x_{\sigma 1}, x_{\sigma 2}, \ldots, x_{\sigma n}) = r(\sigma) f(x_1, x_2, \ldots, x_n)$ for some
r(σ) in R, then f is either symmetric or alternating.
(2) Assume that char R ≠ 2. Then $\Delta_n$ is alternating, and f is alternating if
and only if $f = \Delta_n g$ for some symmetric polynomial g.


Proof. It remains to prove (2). If $f = \Delta_n g$ with g symmetric, then f is
alternating because $\Delta_n$ is alternating. Conversely, assume that f is alternating.
If σ = (1 2) in $S_n$, then $f(x_2, x_1, x_3, \ldots, x_n) = -f(x_1, x_2, x_3, \ldots, x_n)$. Thus,
$2f(x_1, x_1, x_3, \ldots, x_n) = 0$, so $f(x_1, x_1, x_3, \ldots, x_n) = 0$ (as R is a domain and
char R ≠ 2). Now view f as a polynomial in $S[x_1]$, where $S = R[x_2, \ldots, x_n]$. Then
$x_2$ is a root of f in S, so $f = (x_1 - x_2)h$ in $S[x_1]$ by the factor theorem (Theorem
6 §4.1). In the same way, $x_3$ is a root of f in S, so (as $x_3 \neq x_2$) it also is a root of
h. This gives $f = (x_1 - x_2)(x_1 - x_3)k$ in $S[x_1]$, and eventually

$f(x_1, \ldots, x_n) = (x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n) f_1(x_1, \ldots, x_n).$ (***)

We can now complete the proof by induction on n ≥ 2. It is enough to show that
$f = \Delta_n g$ because that implies that g is symmetric (both f and $\Delta_n$ are alternating).
If n = 2, then (***) reads $f = (x_1 - x_2) f_1 = \Delta_2 f_1$. In general, regard $f_1(x_1, \ldots, x_n)$
in (***) as in $T[x_2, \ldots, x_n]$, where $T = R[x_1]$. Then $f_1$ is alternating because
$(x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n)$ is unchanged when $x_2, x_3, \ldots, x_n$ are permuted. By
induction $f_1 = \left[\prod_{2 \le i < j} (x_i - x_j)\right] g$, so $f = \Delta_n g$, as required. ∎


We conclude with the promised proof of Newton’s identities. 
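Proof of Theorem 5. Write

$$f(t) = (t - x_1)(t - x_2) \cdots (t - x_n) \quad \text{in } R[t, x_1, \dots, x_n].$$

Then Theorem 3 gives

$$f(t) = t^n - s_1 t^{n-1} + s_2 t^{n-2} - \cdots + (-1)^n s_n. \qquad (1)$$

If $1 \leq i \leq n$, let $s_k^{(i)}$ denote the $k$th elementary symmetric function of the $n - 1$
variables $x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n$, where $x_i$ is missing. Then we obtain

$$\frac{f(t)}{t - x_i} = t^{n-1} - s_1^{(i)} t^{n-2} + s_2^{(i)} t^{n-3} - \cdots + (-1)^{n-1} s_{n-1}^{(i)}.$$

Adding these equations for $i = 1, 2, \dots, n$ gives

$$\frac{f(t)}{t - x_1} + \frac{f(t)}{t - x_2} + \cdots + \frac{f(t)}{t - x_n}
= n t^{n-1} - \Bigl[\sum_{i=1}^{n} s_1^{(i)}\Bigr] t^{n-2} + \Bigl[\sum_{i=1}^{n} s_2^{(i)}\Bigr] t^{n-3} - \cdots + (-1)^{n-1} \Bigl[\sum_{i=1}^{n} s_{n-1}^{(i)}\Bigr]. \qquad (2)$$

Now the product rule of differentiation shows that

$$(f_1 f_2 f_3 \cdots f_n)' = (f_1' f_2 f_3 \cdots f_n) + (f_1 f_2' f_3 \cdots f_n) + \cdots + (f_1 f_2 f_3 \cdots f_n').$$

Applying this rule to $f(t) = (t - x_1)(t - x_2) \cdots (t - x_n)$ shows that the left-hand
side of (2) equals $f'(t)$. Then differentiating (1) term by term and comparing
coefficients with (2) gives

$$(n - k) s_k = \sum_{i=1}^{n} s_k^{(i)}, \qquad k = 1, 2, \dots, n - 1. \qquad (3)$$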




Now evaluate the sum on the right a different way: Group terms in the sum for $s_k$
into those that involve $x_i$ and those that do not, which gives $s_k = s_k^{(i)} + x_i s_{k-1}^{(i)}$.
Hence

$$s_k^{(i)} = s_k - x_i s_{k-1}^{(i)}, \qquad k = 1, 2, \dots, n - 1.$$

Iterating gives $s_k^{(i)} = s_k - x_i s_{k-1} + x_i^2 s_{k-2}^{(i)}$. Continuing in this way, and using the
fact that $s_0^{(i)} = 1$, yields

$$s_k^{(i)} = s_k - x_i s_{k-1} + x_i^2 s_{k-2} - \cdots + (-1)^k x_i^k.$$

Sum this expression from $i = 1$ to $n$ and use (3) to get

$$(n - k) s_k = n s_k - p_1 s_{k-1} + p_2 s_{k-2} - \cdots + (-1)^k p_k,$$

which gives the $k$th Newton identity.
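The two identities obtained in this proof can be spot-checked numerically. The sketch below is only an illustration: it evaluates everything at one sample list of integers, and the helper names `elementary`, `power_sum`, `identity_3_holds`, and `newton_holds` are hypothetical. The Newton identity is taken in the rearranged form $k s_k = p_1 s_{k-1} - p_2 s_{k-2} + \cdots + (-1)^{k-1} p_k$, which follows from the last display above.

```python
from itertools import combinations
from math import prod

def elementary(xs, k):
    """k-th elementary symmetric polynomial s_k evaluated at the numbers xs (s_0 = 1)."""
    return sum(prod(c) for c in combinations(xs, k))

def power_sum(xs, k):
    """k-th power sum p_k = x_1^k + ... + x_n^k."""
    return sum(x ** k for x in xs)

def identity_3_holds(xs, k):
    """Check (n - k) s_k = sum over i of s_k with x_i omitted, as in equation (3)."""
    n = len(xs)
    omitted = sum(elementary(xs[:i] + xs[i + 1:], k) for i in range(n))
    return (n - k) * elementary(xs, k) == omitted

def newton_holds(xs, k):
    """Check k s_k = p_1 s_{k-1} - p_2 s_{k-2} + ... + (-1)^(k-1) p_k."""
    rhs = sum((-1) ** (i - 1) * power_sum(xs, i) * elementary(xs, k - i)
              for i in range(1, k + 1))
    return k * elementary(xs, k) == rhs

xs = [2, -3, 5, 7]  # arbitrary sample values for x_1, ..., x_4
eq3_checks = [identity_3_holds(xs, k) for k in range(1, len(xs))]
newton_checks = [newton_holds(xs, k) for k in range(1, len(xs))]
```

A numerical check at one point is of course not a proof; it only guards against sign or index slips when transcribing the identities.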
Exercises 4.5 
1. Describe the units in $R[x_1, \dots, x_n]$.


2. In each case write f as the sum of its homogeneous components. 
(a) f(ay,z) = 0° + (w+ yz)? + (w—y)(wz +2 +3) 

(b) f(x,y, 2) = (@ —y)(@— 2) + (2? + 1)(y? + 22) + 2(xz + 3) 

3. Exhibit a polynomial in R[x, y] that is symmetric but not homogeneous, and one 
that is homogeneous but not symmetric. 

4. If $R$ is a domain and $f$ and $g$ are homogeneous of degrees $m$ and $n$, show that $fg$ is
homogeneous of degree $m + n$.

5. Let $\theta: R \to S$ be a ring homomorphism and let $c_1, c_2, c_3, \dots, c_n$ be elements in the
center of $S$. Show that there is a unique ring homomorphism $\bar{\theta}: R[x_1, \dots, x_n] \to S$
such that $\bar{\theta}(r) = \theta(r)$ for all $r \in R$ and $\bar{\theta}(x_i) = c_i$ for all $i$. We say that $\bar{\theta}$ is an
extension of $\theta$ to $R[x_1, \dots, x_n]$.

6. Show that $f(x_1, \dots, x_n)$ is homogeneous of degree $m$ in $R[x_1, \dots, x_n]$ if and only if
$f(tx_1, \dots, tx_n) = t^m f(x_1, \dots, x_n)$ in $R[t, x_1, \dots, x_n]$, $t$ another indeterminate.

7. In each case order the monomials lexicographically. 

(a) xy a2%3, 2123, F303, T7Le 
(b) aygxg%4, £10304, CoE3, T1 4, 304 
8. In each case, express the polynomial f in terms of elementary symmetric 


polynomials. 
(a) f (x1, 22) = oe xia} (b) f (1, Z2, 23) = > apa? 
t# 5 t#g 
(c) f (v1, %2, 23) = Ss reir, (d) f (#1, £2, 23) = by Uta; 
i#GERHI i#5 
9. Show that the number of terms in $s_k(x_1, \dots, x_n)$ is $\binom{n}{k}$.
10. Show that the number of monomials of degree $m$ in $R[x_1, \dots, x_n]$ is $\binom{n+m-1}{m}$. [Hint:
How many ways can you place $m$ zeros and $n - 1$ ones in a row?]

11. Write ps,ps5, and pg in terms of elementary symmetric polynomials. What does the 
formula for ps say if R = Z3? Can you make a conjecture about p, for any prime gq? 
If so, state it. 

12. Using the Newton identities (or otherwise), express the following polynomials in 
$x_1, x_2, \dots, x_n$ in terms of the elementary symmetric polynomials.




(a) f(ai,..-,@n) = Yo (ei — 23)? (b) f(t1,-.-,%n) = D0 fay 
i<j t<j 
(c) {Gina tal =) eee : 
i<j 
13. Let the roots of $x^3 - 5x^2 + 4x - 3$ be $u$, $v$, and $w$.
(a) Find the polynomial with roots u?,v?, and w?. 
(b) Find the polynomial with roots 4,4, and +. 
14. Given $\sigma \in S_n$, define $\theta_\sigma : R[x_1, \dots, x_n] \to R[x_1, \dots, x_n]$ by
$\theta_\sigma[f(x_1, \dots, x_n)] = f(x_{\sigma 1}, \dots, x_{\sigma n})$.

(a) Show that $\theta_\sigma$ is a ring automorphism of $R[x_1, \dots, x_n]$.
(b) Show that $\sigma \mapsto \theta_\sigma$ is a group homomorphism $S_n \to \operatorname{aut} R[x_1, \dots, x_n]$, which is
one-to-one.
(c) If $G \subseteq \operatorname{aut} R[x_1, \dots, x_n]$ is a subgroup, show that $S_G = \{f \mid \theta(f) = f \text{ for all } \theta \in G\}$
is a subring of $R[x_1, \dots, x_n]$, called the ring of $G$-symmetric polynomials.

15. Let $f(x_1, \dots, x_n)$ be a polynomial in $\mathbb{Z}_p[x_1, \dots, x_n]$. If $f$ has degree less than $p$ in
each indeterminate $x_i$, show that $f(a_1, \dots, a_n) \neq 0$ for some $a_i \in \mathbb{Z}_p$.

16. Find a symmetric polynomial $g(x, y)$ such that $x^m y^n - x^n y^m = \Delta_2\, g(x, y)$. Assume
that $m > n$.

17. Suppose that $p \in R[x]$ is odd; that is, $p(-x) = -p(x)$. If $f(x_1, \dots, x_n)$ is any
alternating polynomial in $R[x_1, \dots, x_n]$, show that $f_1(x_1, \dots, x_n) = p[f(x_1, \dots, x_n)]$
is also alternating. If $f = \Delta_n g$, where $g$ is symmetric, find a symmetric polynomial
$g_1(x_1, \dots, x_n)$ such that $f_1 = \Delta_n g_1$.

18. Let $S$ and $A$ denote, respectively, the sets of symmetric and alternating polynomials
in $R[x_1, \dots, x_n]$, and let $T = \{f + g \mid f \in S \text{ and } g \in A\}$. Assume that $n \geq 2$, $R$ is a
domain, and char $R \neq 2$. Show that $T$ is a ring, $\Delta_n$ is central in $T$, $T = S + \Delta_n T$
as additive subgroups, $S \cap \Delta_n T = \Delta_n^2 S$, and $T/(\Delta_n T) \cong S/(\Delta_n^2 S)$ as rings.

19. Write $n$-tuples in $\mathbb{N}^n$ as $a = (a_1, a_2, \dots, a_n)$. Define the lexicographic order or
dictionary order on $\mathbb{N}^n$ by $a \leq b$ if $a = b$ or $a_k < b_k$, where $k$ is the smallest integer
$t$ with $a_t \neq b_t$.

(a) Show that $\leq$ is a partial ordering on $\mathbb{N}^n$; that is, $a \leq a$ for all $a$; $a \leq b$ and
$b \leq a$ imply that $a = b$; and $a \leq b$ and $b \leq c$ imply that $a \leq c$.

(b) Show that $\leq$ is a total (or linear) ordering; that is, $a \leq b$ or $b \leq a$ for all $a$ and
$b$ in $\mathbb{N}^n$.

(c) Show that $\leq$ well orders $\mathbb{N}^n$; that is, any nonempty set of $n$-tuples has a smallest
element.

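20. (a) Show that $G = \{a \in \mathbb{R} \mid -1 < a < 1\}$ is a group via $a * b = \dfrac{a + b}{1 + ab}$.
(b) Show that
$$x_1 * \cdots * x_n = \frac{s_1 + s_3 + \cdots + s_n}{1 + s_2 + \cdots + s_{n-1}} \ \text{ if } n \text{ is odd, and} \quad
x_1 * \cdots * x_n = \frac{s_1 + s_3 + \cdots + s_{n-1}}{1 + s_2 + \cdots + s_n} \ \text{ if } n \text{ is even.}$$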


4.6 FORMAL CONSTRUCTION OF POLYNOMIALS 


If $R$ is any ring, we want to construct an indeterminate $x$ over $R$, and so give precise
meaning to the ring $R[x]$ of polynomials over $R$. We construct $R[x]$ as a subring of a
larger ring $S$ so that each expression $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ must be in $S$ for
any choice of $a_i \in R$. The elements $a_i$ of $R$ (and $x$) determine such an expression
so, not surprisingly, we can construct $S$ by using sequences from $R$.

A sequence from a ring $R$ is a function $\hat{a}: \mathbb{N} \to R$. If we write $\hat{a}(m) = a_m$ for
each $m \geq 0$, it is customary to display the sequence explicitly as $a_0, a_1, a_2, \cdots$. We
will denote this sequence by

$$[a_m) = [a_0, a_1, a_2, \cdots).$$

If $\hat{b}: \mathbb{N} \to R$ is another sequence denoted $\hat{b}(m) = b_m$, then $\hat{a} = \hat{b}$ if and only if
$\hat{a}(m) = \hat{b}(m)$ for all $m \geq 0$, that is, if and only if $a_m = b_m$ for all $m \geq 0$. In other
words, two sequences are equal when all the terms agree:

$$[a_m) = [b_m) \quad \text{if and only if} \quad a_m = b_m \ \text{for all } m \geq 0.^{62}$$

Now let $S$ denote the set of all sequences from $R$:

$$S = \{[a_m) \mid a_m \in R \text{ for all } m \geq 0\}.$$

We are going to make $S$ into a ring. We begin by defining addition on the set $S$:

$$[a_m) + [b_m) = [a_m + b_m).$$


It is an easy matter to verify that $S$ is an abelian group with this addition. The zero
element is the constant sequence $[0) = [0, 0, 0, \cdots)$, and the negative of a sequence
$[a_m)$ is $-[a_m) = [-a_m) = [-a_0, -a_1, -a_2, \cdots)$.

The multiplication on $S$ is convolution, defined as follows:

$$[a_m)[b_m) = [p_m) \quad \text{where} \quad p_m = \sum_{i+j=m} a_i b_j \quad \text{for all } m \geq 0.$$

Hence $p_m = a_0 b_m + a_1 b_{m-1} + \cdots + a_{m-1} b_1 + a_m b_0$ for each $m \in \mathbb{N}$. We leave
to the reader the easy verification that the sequence $[1, 0, 0, \cdots)$ is the unity for
multiplication. Next, we check associativity. Given three sequences $\hat{a} = [a_m)$,
$\hat{b} = [b_m)$, and $\hat{c} = [c_m)$, we write $\hat{a}\hat{b} = [p_m)$, where $p_m = \sum_{i+j=m} a_i b_j$. Then
$(\hat{a}\hat{b})\hat{c} = [p_m)[c_m) = [r_m)$, where

$$r_m = \sum_{t+k=m} p_t c_k = \sum_{t+k=m} \Bigl(\sum_{i+j=t} a_i b_j\Bigr) c_k = \sum_{i+j+k=m} (a_i b_j) c_k.$$

A similar calculation shows that $\hat{a}(\hat{b}\hat{c}) = [s_m)$, where $s_m = \sum_{i+j+k=m} a_i (b_j c_k)$.
Hence, the associativity of the multiplication in $S$ follows from that in $R$. A similar
verification (which we also leave to the reader) shows that the distributive laws
$\hat{a}(\hat{b} + \hat{c}) = \hat{a}\hat{b} + \hat{a}\hat{c}$ and $(\hat{b} + \hat{c})\hat{a} = \hat{b}\hat{a} + \hat{c}\hat{a}$ hold for all sequences $\hat{a}$, $\hat{b}$, and $\hat{c}$ in $S$.
Hence, $S$ is a ring.

To construct $R[x]$ as a subring of $S$, we must first embed $R$ as a subring of $S$.
To this end, define $\theta: R \to S$ by $\theta(a) = [a, 0, 0, \cdots)$ for all $a \in R$. One verifies that
$\theta$ is a one-to-one ring homomorphism so $R$ is isomorphic to the subring $\theta(R)$ of $S$.
We identify these two copies of $R$ by writing

$$a = \theta(a) = [a, 0, 0, 0, \cdots),$$

which makes $R$ into a subring of $S$. Finally, we define

$$x = [0, 1, 0, 0, \cdots)$$

in $S$ and observe that

$$ax = [a, 0, 0, \cdots)[0, 1, 0, \cdots) = [0, a, 0, \cdots) = [0, 1, 0, \cdots)[a, 0, 0, \cdots) = xa$$

holds for all $a \in R$. Moreover, $ax^2 = [0, 0, a, 0, \cdots)$, $ax^3 = [0, 0, 0, a, 0, \cdots)$, $\dots$, so

$$a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = [a_0, a_1, a_2, \cdots, a_n, 0, 0, \cdots)$$

for all $a_i \in R$. This result enables us to construct the polynomial ring $R[x]$.
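The construction above can be mirrored in a few lines of code. The following is a minimal sketch, assuming integer coefficients and representing only sequences that are eventually zero as Python lists; the function names `add`, `mul`, `trim`, and `const` are hypothetical and chosen for illustration.

```python
def add(a, b):
    """Termwise sum of two eventually-zero sequences stored as coefficient lists."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [u + v for u, v in zip(a, b)]

def mul(a, b):
    """Convolution: entry m of the result is the sum of a[i] * b[j] over i + j = m."""
    if not a or not b:
        return []
    p = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            p[i + j] += u * v
    return p

def trim(a):
    """Drop trailing zeros so that equal sequences have equal list forms."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def const(r):
    """The embedding theta(r) = [r, 0, 0, ...)."""
    return [r]

x = [0, 1]                      # the sequence [0, 1, 0, 0, ...)
ax = trim(mul(const(7), x))     # [0, 7]
xa = trim(mul(x, const(7)))     # [0, 7]
x2 = trim(mul(x, x))            # [0, 0, 1]
# 3 + 5x + 2x^2 built from the ring operations
poly = trim(add(add(const(3), mul(const(5), x)), mul(const(2), x2)))
```

Here `poly` comes out as the coefficient list `[3, 5, 2]`, matching the identification of $a_0 + a_1 x + a_2 x^2$ with the sequence $[a_0, a_1, a_2, 0, \cdots)$.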


62 This includes the construction of $n$-tuples as sequences $[a_m)$, where $a_m = 0$ for all $m > n$.




Theorem 1. Let $R$ be any ring. There exists a ring $S$ that contains $R$ as a subring
and contains an element $x$ with the following properties:
(1) $ax = xa$ for all $a \in R$.
(2) If $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = b_0 + b_1 x + b_2 x^2 + \cdots + b_n x^n$ in $S$, then
$a_i = b_i$ for each $i \geq 0$.
Hence $x$ is an indeterminate over $R$ by (2), so the subring

$$R[x] = \{a_0 + a_1 x + \cdots + a_n x^n \mid n \geq 0;\ a_i \in R \text{ for each } i\}$$

is the ring of polynomials over $R$ and has all the properties required in Section 4.1.
□


The ring $S$ is itself of interest. If $x$ is as we defined it, we can write the sequences
in $S$ as

$$[a_m) = a_0 + a_1 x + a_2 x^2 + \cdots = \sum_{i=0}^{\infty} a_i x^i,$$

where infinitely many of the coefficients $a_i$ may be nonzero. Thus, $S$ is called the
ring of formal power series over $R$, and is denoted $S = R[[x]]$. The polynomial
ring $R[x]$ arises as the subring of all power series $\sum a_i x^i$ for which $a_i = 0$ for all
but finitely many $i$.


Chapter 5 


Factorization in Integral Domains 


There still remain three studies suitable for free man. Arithmetic is one of them. 
—Plato 


We see therefore that ideal prime factors reveal the essence of complex numbers, make 
them transparent, as it were, and disclose their inner crystalline structure. 


—Ernst Eduard Kummer 


We have proved two unique factorization theorems: Every integer greater than one is
uniquely a product of primes, and if $F$ is a field every polynomial of positive degree
is uniquely a product of an element of $F$ times a product of monic irreducible
polynomials. In this chapter, we characterize the integral domains for which a
similar theorem holds (called unique factorization domains, or UFDs) and discuss
some important classes of UFDs.⁶³

This theory has a long history and can be regarded as one of the original sources
of modern abstract algebra. At the beginning of the nineteenth century, Gauss used
the fact that the ring $\mathbb{Z}(i)$ (now called the gaussian integers) is a UFD to prove
his law of biquadratic reciprocity, a method of determining when the congruence
$x^4 \equiv b \pmod{n}$ has a solution. Inspired by the fact that $i$ is a (fourth) root of unity,
Kummer tried to extend Gauss' work by considering $\mathbb{Z}(\omega)$, where $\omega$ is any complex
root of unity. However, he discovered that $\mathbb{Z}(\omega)$ may not be a UFD. This observation
had other implications. In 1847, Gabriel Lamé announced that he had solved one of
the most famous problems in number theory, usually called Fermat's last theorem.
It asserts that the equation $x^n + y^n = z^n$ has no solution in positive integers $x$, $y$,


63 Apart from Chapter 7, the material in this chapter is not essential elsewhere in this book. 


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 




and $z$ for any integer $n \geq 3$. It is sufficient to prove this assertion if $n = p \geq 3$ is a
prime. If $\omega \neq 1$ is a $p$th root of unity, Lamé had factored $x^p + y^p$ in $\mathbb{Z}(\omega)$ as

$$x^p + y^p = (x + y)(x + \omega y)(x + \omega^2 y) \cdots (x + \omega^{p-1} y)$$

and then appealed to the (assumed) unique factorization in $\mathbb{Z}(\omega)$.

Kummer responded by proving that unique factorization does hold in $\mathbb{Z}(\omega)$
for what he called ideal numbers. This proof led to verification of Fermat's last
theorem for many primes.⁶⁴ However, Kummer's work had far greater significance
for modern algebra because his ideal numbers were what we now call ideals. The
idea was taken up by Dedekind, who characterized the integral domains in which
every nonzero ideal is uniquely a product (suitably defined) of prime ideals.


5.1 IRREDUCIBLES AND UNIQUE FACTORIZATION 


The higher arithmetic presents us with an inexhaustible storehouse of interesting 
truths... between which... we continually discover new and wholly unexpected points 
of contact. 


—Carl Friedrich Gauss


Recall that a ring $R$ is called an integral domain if it is commutative and $ab = 0$ in
$R$ implies that $a = 0$ or $b = 0$. In this section, we are concerned with factorization
of elements in an integral domain $R$. We say that an element $a$ of $R$ is factored in
$R$ if it is equal to a product of two or more elements of $R$. Some factorizations are
in a sense trivial. For example, $a = 1 \cdot a$ holds for all $a$. More generally, if $u$ is a
unit in $R$, then $a = u(u^{-1}a)$, and a factorization $a = ub$, where $u$ is a unit, is called
a trivial factorization. Such factorizations are of no interest, and we regard two
factorizations $a = bc$ and $a = (ub)(u^{-1}c)$ as essentially the same.

As for $\mathbb{Z}$, if $R$ is an integral domain and $a, b \in R$, we write $a \mid b$ if $b = ac$ for some
$c \in R$. In this case, we say that $a$ divides $b$ or that $a$ is a divisor of $b$. Verification
of the following properties is easy.


(1) $a \mid a$ for all $a \in R$.
(2) If $a \mid b$ and $b \mid c$, then $a \mid c$.
(3) If $a \mid b$ and $a \mid c$, then $a \mid (rb + sc)$ for all $r, s \in R$.


If $m$ and $n$ are nonzero integers, we can easily verify that both $m \mid n$ and $n \mid m$ hold
if and only if $m = \pm n$, that is, if and only if $m = un$, where $u$ is a unit of $\mathbb{Z}$. This
holds in any integral domain $R$. Moreover, it is related to the set of principal ideals
$(a) = Ra$ generated by elements $a$ of $R$.


Theorem 1. If $R$ is an integral domain, the following are equivalent for $a, b \in R$:
(1) $a \mid b$ and $b \mid a$.
(2) $a = ub$ for some unit $u$ in $R$.
(3) $(a) = (b)$.


64 The “last theorem” remained open until 1994, when it was finally proved by Andrew Wiles (the proof was published in 1995).




Proof. (1) $\Rightarrow$ (2). If $a \mid b$ and $b \mid a$, write $b = va$ and $a = ub$. If $a = 0$, then $b = va = 0$
too, so $a = 1b$. If $a \neq 0$, then $a = u(va) = (uv)a$ implies that $uv = 1$ because $R$ is
a domain. Thus, $u$ is a unit.

(2) $\Rightarrow$ (3). If $a = ub$, then $a \in Rb$, so $Ra \subseteq Rb$. Similarly, $b = u^{-1}a$ gives
$Rb \subseteq Ra$. Hence, $Ra = Rb$, giving (3).

(3) $\Rightarrow$ (1). If $(a) = (b)$, then $a \in (a) = (b) = Rb$. Hence, $b \mid a$; similarly, $a \mid b$. □


Let $R$ be an integral domain. If $a$ and $b$ are elements of $R$, we write

$$a \sim b \quad \text{if and only if} \quad a \mid b \text{ and } b \mid a.$$

In this case, $a$ and $b$ are said to be associates in $R$. Condition (3) in Theorem 1
implies immediately that the associate relation $\sim$ is an equivalence on $R$:

(1) $a \sim a$ for all $a \in R$.
(2) If $a \sim b$, then $b \sim a$.
(3) If $a \sim b$ and $b \sim c$, then $a \sim c$.

In this case, the equivalence class of $a \in R$ is

$$[a] = \{r \mid r \sim a\} = \{ua \mid u \text{ is a unit in } R\} = R^* a,$$

where, as usual, $R^*$ denotes the group of units of $R$. In particular, $[0] = \{0\}$ and
$[1] = R^*$. If $R = \mathbb{Z}$ and $n \in \mathbb{Z}$, then $[n] = \{n, -n\}$. If $R = F[x]$, where $F$ is a field,
and $f \in R$, then $[f] = \{af \mid 0 \neq a \in F\}$ because $F[x]^* = F^* = F \setminus \{0\}$.

Note that the associate relation $\sim$ in an integral domain is compatible with
divisibility and multiplication in the following sense:

(1) If $a \sim a'$ and $b \sim b'$, then $a \mid b$ if and only if $a' \mid b'$.
(2) If $a \sim a'$ and $b \sim b'$, then $ab \sim a'b'$.

These facts will be used frequently; we leave the verifications as Exercises 2 and 5.


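Example 1. Show that $\sqrt{3} \sim (3 + 2\sqrt{3})$ in the integral domain
$R = \mathbb{Z}(\sqrt{3}) = \{m + n\sqrt{3} \mid m, n \in \mathbb{Z}\}$.

Solution. We have $3 + 2\sqrt{3} = (2 + \sqrt{3}) \cdot \sqrt{3}$, and $2 + \sqrt{3}$ is a unit in $R$ (indeed,
$(2 + \sqrt{3})(2 - \sqrt{3}) = 1$). □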
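The two products used in this solution can be confirmed with exact integer arithmetic. The sketch below assumes the element $m + n\sqrt{3}$ is stored as the integer pair `(m, n)`; the function name `mul` and the variable names are hypothetical.

```python
def mul(u, v):
    """(m1 + n1*sqrt3)(m2 + n2*sqrt3), with elements stored as integer pairs (m, n)."""
    m1, n1 = u
    m2, n2 = v
    return (m1 * m2 + 3 * n1 * n2, m1 * n2 + n1 * m2)

sqrt3 = (0, 1)
unit = (2, 1)         # 2 + sqrt3
unit_inv = (2, -1)    # 2 - sqrt3

product = mul(unit, sqrt3)         # expected to represent 3 + 2*sqrt3
check_unit = mul(unit, unit_inv)   # expected to represent 1
```

Since `product` is the pair `(3, 2)` and `check_unit` is `(1, 0)`, the element $3 + 2\sqrt{3}$ is a unit multiple of $\sqrt{3}$, which is the associate relation claimed.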


We are interested in factorizations of elements of an integral domain R that are 
unique up to associates of the factors. Clearly, 0 must be excluded from consideration
because $0 = 0 \cdot a$ holds for every $a \in R$. Also, if $u$ is a unit and $u = ab$, then both
a and b are units; that is, all factorizations of a unit are trivial. Hence, we consider 
only nonzero nonunits. If such an element is factored nontrivially, one of the factors 
may have a nontrivial factorization. If this factorization is carried out, factors that 
can be further reduced may still remain. This process suggests consideration of those 
nonzero nonunits that, like the primes in Z, admit no nontrivial factorization. 




If $R$ is an integral domain, $p \in R$ is called an irreducible element⁶⁵ (and is
said to be irreducible in $R$) if it satisfies the following conditions:

(1) $p \neq 0$ and $p$ is not a unit.
(2) If $p = ab$ in $R$, then $a$ or $b$ is a unit in $R$.

An element that is not irreducible is called reducible.

If $R = F[x]$, where $F$ is a field, this definition agrees with the notion of an
irreducible polynomial used in Section 4.2. However, the irreducibles in $\mathbb{Z}$ are the
elements of the form $\pm p$, where $p$ is a prime. Note that a field has no irreducibles
because no element is a nonzero nonunit.


Example 2. If $R = \mathbb{Z}(i) = \{m + ni \mid m, n \in \mathbb{Z}\}$ is the ring of gaussian integers,
show that $p = 1 + i$ is irreducible in $R$.

Solution. Suppose that $p = ab$ in $R$. Taking absolute values gives $|a|^2 |b|^2 = |p|^2 = 2$.
Hence, $|a|^2 = 1$ or $|b|^2 = 1$ because $|a|^2$ and $|b|^2$ are positive integers. If $|a|^2 = 1$
and we write $a = m + ni$, where $m, n \in \mathbb{Z}$, then $m^2 + n^2 = |a|^2 = 1$. Hence,
$a \in \{1, -1, i, -i\}$, so $a$ is a unit in $R$. Similarly, $|b|^2 = 1$ implies that $b$ is a unit. □


Example 3. Let $R = \mathbb{Z}(\sqrt{-5}) = \{m + n\sqrt{-5} \mid m, n \in \mathbb{Z}\}$. Show that $p = 1 + \sqrt{-5}$
is irreducible in $R$.

Solution. If $a = m + n\sqrt{-5}$, we define the norm of $a$ to be $N(a) = m^2 + 5n^2$. The
reader can verify that $N(ab) = N(a)\,N(b)$ for $a$, $b$ in $R$. Now suppose that $p = ab$
in $R$, so $6 = N(p) = N(a)\,N(b)$. Since $m^2 + 5n^2$ never equals 2 or 3, neither $N(a)$ nor
$N(b)$ can be 2 or 3, which means that $N(a) = 1$ or $N(b) = 1$. But then $a = \pm 1$ or
$b = \pm 1$, so one of them is a unit. □
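The norm argument rests on two facts: the norm is multiplicative, and no element has norm 2 or 3. Both can be checked mechanically. The sketch below stores $m + n\sqrt{-5}$ as the pair `(m, n)`; the names `norm`, `mul`, and `elements_of_norm` are hypothetical. The search box is justified by the bounds $m^2 \leq N$ and $5n^2 \leq N$.

```python
from math import isqrt

def norm(m, n):
    """N(m + n*sqrt(-5)) = m^2 + 5 n^2."""
    return m * m + 5 * n * n

def mul(u, v):
    """Product in Z(sqrt(-5)) on integer pairs (m, n)."""
    m1, n1 = u
    m2, n2 = v
    return (m1 * m2 - 5 * n1 * n2, m1 * n2 + n1 * m2)

def elements_of_norm(target):
    """All pairs (m, n) with m^2 + 5 n^2 == target; |m|, |n| <= isqrt(target) suffices."""
    bound = isqrt(target)
    return [(m, n)
            for m in range(-bound, bound + 1)
            for n in range(-bound, bound + 1)
            if norm(m, n) == target]

norm_1 = elements_of_norm(1)   # only 1 and -1
norm_2 = elements_of_norm(2)   # empty
norm_3 = elements_of_norm(3)   # empty

# One sample check of N(ab) = N(a) N(b)
a, b = (2, 3), (-1, 4)
multiplicative = norm(*mul(a, b)) == norm(*a) * norm(*b)
```

With norms 2 and 3 excluded, any factorization of an element of norm 6 must contain a factor of norm 1, and the only such elements are $\pm 1$.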


The method in Examples 2 and 3 applies more generally, and we return to it in 
Section 5.2. First, we derive three useful conditions that make an element irre- 
ducible. The second is often taken as a definition of irreducibility. 


Theorem 2. If $R$ is an integral domain, the following conditions are equivalent for
a nonzero nonunit $p$ in $R$:

(1) $p$ is irreducible.

(2) If $d \mid p$, then $d \sim 1$ or $d \sim p$.

(3) If $p \sim ab$ in $R$, then $p \sim a$ or $p \sim b$.

(4) If $p = ab$ in $R$, then $p \sim a$ or $p \sim b$.

Proof. (1) $\Rightarrow$ (2). If $p = ad$, then $d$ or $a$ is a unit by (1), so $d \sim 1$ or $d \sim p$.

(2) $\Rightarrow$ (3). If $p \sim ab$, then $b \mid p$, so $b \sim 1$ or $b \sim p$ by (2). In the first case, $p \sim a$;
in the second, $p \sim b$.

(3) $\Rightarrow$ (4). This is clear because $p = ab$ implies $p \sim ab$.

(4) $\Rightarrow$ (1). If $p = ab$, then $p \sim a$ or $p \sim b$ by (4). If $p \sim a$, write $a = up$, where
$u$ is a unit. Then $p = ab = upb$, so $1 = ub$ ($R$ is a domain). Thus, $b$ is a unit.
Similarly, $p \sim b$ implies that $a$ is a unit, so (1) follows. □


65 Irreducible elements are also called atoms.




An immediate consequence of Theorem 2 is that irreducibility is compatible with
the associate relation $\sim$ on $R$. More precisely,

If $p \sim q$, then $p$ is irreducible if and only if $q$ is irreducible.


We leave the proof as Exercise 16. 

We can now give conditions on an integral domain $R$ under which all nonzero
nonunits can be factored in some way as a product of irreducibles.⁶⁶ For the moment,
call a nonzero nonunit “bad” if it cannot be written as a product of irreducibles.
Suppose $a$ is “bad.” Then $a$ is certainly not irreducible, so

$$a = x_1 a_1, \quad a \nsim x_1, \quad \text{and} \quad a \nsim a_1,$$

by Theorem 2. Now at least one of $x_1$ and $a_1$ is “bad” (otherwise both are products
of irreducibles, so $a$ is not “bad”). Suppose that $a_1$ is “bad.” Then, as before,

$$a_1 = x_2 a_2, \quad a_1 \nsim x_2, \quad \text{and} \quad a_1 \nsim a_2,$$

where $a_2$ is “bad.” This process continues indefinitely. We have $a \in Ra_1 = (a_1)$,
so $(a) \subseteq (a_1)$. Similarly, $(a_1) \subseteq (a_2)$, $(a_2) \subseteq (a_3), \dots$, and we obtain an ascending
chain of principal ideals:

$$(a) \subseteq (a_1) \subseteq (a_2) \subseteq \cdots$$

Furthermore, $a \nsim a_1$, $a_1 \nsim a_2, \dots$, so (by Theorem 1) the chain is strictly increasing:

$$(a) \subset (a_1) \subset (a_2) \subset \cdots.$$

Hence, any condition on $R$ that rules out such strictly increasing chains guarantees
that $R$ contains no “bad” elements. Thus, the following definition is germane.

An integral domain $R$ is said to satisfy the ascending chain condition on
principal ideals (ACCP) if $R$ contains no strictly increasing infinite ascending
chain $(a_1) \subset (a_2) \subset (a_3) \subset \cdots$ of principal ideals. The preceding argument proves


Theorem 3. Let $R$ be an integral domain that satisfies the ACCP. Then every
nonzero nonunit in $R$ is a product of irreducibles.
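In the familiar case $R = \mathbb{Z}$, the splitting process behind this theorem is easy to simulate. The sketch below is only an illustration for integers $n > 1$: a reducible factor is split into two smaller factors, and the process stops because the factors strictly decrease. The names `split` and `irreducible_factors` are hypothetical, and trial division is used purely for simplicity.

```python
def split(n):
    """Return a nontrivial factorization (x, a) of an integer n > 1, or None if n is prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None

def irreducible_factors(n):
    """Factor an integer n > 1 by repeatedly splitting any factor that is still reducible."""
    pending, done = [n], []
    while pending:
        a = pending.pop()
        parts = split(a)
        if parts is None:
            done.append(a)          # a admits no nontrivial factorization
        else:
            pending.extend(parts)   # both parts are strictly smaller than a
    return sorted(done)

factors = irreducible_factors(360)
```

For 360 this yields the list `[2, 2, 2, 3, 3, 5]`. In a domain where the ACCP fails, the analogous loop need not terminate, which is exactly what the “bad” element argument detects.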


The usefulness of Theorem 3 stems from the fact that the ACCP is easy to work 
with. One reason is the following useful alternative form of the condition. 


Lemma 1. The following conditions are equivalent for an integral domain $R$.
(1) $R$ satisfies the ACCP.
(2) For any ascending chain $(a_1) \subseteq (a_2) \subseteq (a_3) \subseteq \cdots$ of principal ideals in $R$,
an integer $n \geq 1$ exists such that $(a_n) = (a_{n+1}) = \cdots$.

Proof. (1) $\Rightarrow$ (2). Suppose $(a_1) \subseteq (a_2) \subseteq \cdots$ but no $n$ exists such that $(a_n) = (a_{n+k})$
for all $k \geq 0$. Then $(a_1) \subset (a_{n_1})$ for some $n_1$. Again, $(a_{n_1}) \subset (a_{n_2})$ for some $n_2$. By
Theorem 3 §1.1, this continues to give $(a_1) \subset (a_{n_1}) \subset (a_{n_2}) \subset \cdots$, contrary to (1).
This proves (1) $\Rightarrow$ (2). The proof that (2) $\Rightarrow$ (1) is left as Exercise 20. □


Example 4. Show that Z satisfies the ACCP. 


Solution. If $(a_1) \subseteq (a_2) \subseteq (a_3) \subseteq \cdots$ in $\mathbb{Z}$, then $a_2 \mid a_1$, $a_3 \mid a_2$, $\dots$. Taking
absolute values gives $|a_1| \geq |a_2| \geq |a_3| \geq \cdots$. Since each $|a_i| \geq 0$ is an integer,


66 Meaning a product of one or more irreducibles.




$|a_n| = |a_{n+1}| = \cdots$ must hold for some $n$. But then $a_{i+1} = \pm a_i$ for all $i \geq n$, so
$(a_i) = (a_{i+1})$ for all $i \geq n$, which is what we wanted. □


More generally, we show in Section 5.2 that $\mathbb{Z}(\omega) = \{m + n\omega \mid m, n \in \mathbb{Z}\}$ satisfies
the ACCP for any complex number $\omega$ such that $\omega^2 \in \mathbb{Z}$ but $\omega \notin \mathbb{Q}$.

An argument similar to the solution of Example 4 (using degree in place of absolute
value) shows that $F[x]$ satisfies the ACCP for any field $F$ (Exercise 22). As $F$ itself
satisfies the ACCP (it has only two ideals!), this also follows from Theorem 4.


Theorem 4. If $R$ is an integral domain that satisfies the ACCP, then the ring $R[x]$
of polynomials over $R$ also satisfies the ACCP.

Proof. If not, let $(f_1) \subset (f_2) \subset (f_3) \subset \cdots$ be a strictly increasing chain in $R[x]$. If
$a_i$ denotes the leading coefficient of $f_i$ for each $i$, then $a_{i+1} \mid a_i$ in $R$ because $f_{i+1} \mid f_i$,
so $(a_1) \subseteq (a_2) \subseteq (a_3) \subseteq \cdots$. By hypothesis, let $(a_n) = (a_{n+1}) = \cdots$ for some
$n \geq 1$; that is, $a_n \sim a_{n+1} \sim \cdots$. If $m \geq n$, let $f_m = g f_{m+1}$, $g \in R[x]$. If $b$ is the
leading coefficient of $g$, then $a_m = b a_{m+1}$, so, as $a_m \sim a_{m+1}$, $b$ is a unit in $R$.
But $g$ is not a unit in $R[x]$ because $(f_m) \neq (f_{m+1})$, which means that $\deg g \geq 1$.
Hence, $\deg f_m > \deg f_{m+1}$ by Theorem 3 §4.1. This is true for all $m \geq n$, so

$$\deg f_n > \deg f_{n+1} > \deg f_{n+2} > \cdots.$$

This is a contradiction since $\deg f_m$ is a nonnegative integer for each $m$. It follows
that $R[x]$ satisfies the ACCP. □


Hence, Example 4 shows that $\mathbb{Z}[x]$ satisfies the ACCP. More generally, if $R$ is any
integral domain satisfying the ACCP, then iterating Theorem 4 shows successively
that the integral domains $R[x]$, $R[x, y] = (R[x])[y]$, $R[x, y, z] = (R[x, y])[z], \dots$, all
satisfy the ACCP.

Note that there exist integral domains in which the ACCP fails. For example,
consider $R = \{n + xf \mid n \in \mathbb{Z},\ f \in \mathbb{Q}[x]\}$, the set of all polynomials in $\mathbb{Q}[x]$
whose constant term is in $\mathbb{Z}$. Then $R$ is an integral domain (subring of $\mathbb{Q}[x]$), but
$(x) \subset (\tfrac{1}{2}x) \subset (\tfrac{1}{4}x) \subset \cdots$ as is readily verified. This interesting example is explored
further in Exercise 40.
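For the chain of ideals generated by $x/2^k$, $k = 0, 1, 2, \dots$, in this ring, the strict inclusions can be checked directly. The short derivation below is a sketch of one way to do so, using only the description of $R$ as the polynomials in $\mathbb{Q}[x]$ with integer constant term; the symbol $h$ is introduced here for illustration.

```latex
% Each inclusion (x/2^k) \subseteq (x/2^{k+1}) holds because
\frac{x}{2^{k}} \;=\; 2 \cdot \frac{x}{2^{k+1}},
\qquad 2 \in R, \quad \frac{x}{2^{k+1}} \in R .
% It is strict: if x/2^{k+1} = (x/2^{k})\, h with h \in R,
% then cancelling x/2^{k} in \mathbb{Q}[x] gives
h \;=\; \tfrac{1}{2},
% whose constant term is not an integer, so h \notin R.
```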


Unique Factorization Domains 


Factorizations into irreducibles are much more useful when we know that they are 
unique up to associates of the factors. An integral domain R is called a unique 
factorization domain (UFD) if it satisfies the following conditions. 


(1) Every nonzero nonunit in R is a product of irreducibles. 


(2) If $p_1 p_2 \cdots p_r \sim q_1 q_2 \cdots q_s$, where the $p_i$ and the $q_j$ are irreducibles,
then $r = s$ and (after possible relabeling) $p_i \sim q_i$ for each $i$.


Thus, Theorem 12 §4.2 shows that $F[x]$ is a UFD for any field $F$. Moreover, the
field $F$ is itself a UFD: Conditions (1) and (2) hold vacuously in this case because
$F$ contains no nonzero nonunits, and hence no irreducibles. Of course, $\mathbb{Z}$ is the
prototype example of a UFD. Note that Theorem 7 §1.2 proves unique factorization
only for integers $n \geq 2$ in $\mathbb{Z}$. However, this theorem clearly extends to any integer
$n \leq -2$ because $-p$ is irreducible for any prime $p$.




If $R = \mathbb{Z}$, scrutiny of the proof of Theorem 7 §1.2 shows that the uniqueness
of the factorization into primes depends crucially on the primes $p \in \mathbb{Z}$ having the
following property: If $p \mid ab$ in $\mathbb{Z}$, then $p \mid a$ or $p \mid b$ (Euclid's lemma). However,
irreducibles in an arbitrary integral domain need not have this property, which leads us
to yet another definition. An element $p$ in an integral domain $R$ is called a prime
element of $R$ (and is said to be prime in $R$) if it satisfies the following conditions:

(1) $p \neq 0$ and $p$ is not a unit.
(2) If $p \mid ab$ in $R$, then $p \mid a$ or $p \mid b$.

Once again, being prime is compatible with the associate relation $\sim$, that is,
If $p \sim q$ in $R$, then $p$ is prime if and only if $q$ is prime.
We leave the verification as Exercise 16.


Theorem 5. Every prime in an integral domain R is irreducible in R, but the 
converse fails for some integral domains R. 


Proof. Let $p$ be prime in $R$. If $p = ab$ in $R$, we must show that $a$ or $b$ is a
unit. Clearly, $p \mid ab$, so $p \mid a$ or $p \mid b$ by hypothesis. If $p \mid a$, let $a = dp$. Then $a = d(ab)$,
so, as $a \neq 0$ in the domain $R$, $1 = db$ and $b$ is a unit. Similarly, $p \mid b$ implies that $a$
is a unit, so $p$ is irreducible. Example 5 shows that the converse can fail. □


Example 5. If $R = \mathbb{Z}(\sqrt{-5}) = \{m + n\sqrt{-5} \mid m, n \in \mathbb{Z}\}$, show that $p = 1 + \sqrt{-5}$ is
irreducible in $R$ but not prime in $R$.

Solution. Example 3 shows that $p$ is irreducible in $R$. To see that $p$ is not a
prime, observe that $2 \cdot 3 = 6 = (1 + \sqrt{-5})(1 - \sqrt{-5}) = p \cdot (1 - \sqrt{-5})$ in $R$. If $p$ is
a prime, this implies that $p \mid 2$ or $p \mid 3$ in $R$. Suppose that $p \mid 2$, say $2 = qp$. Now write
$N(m + n\sqrt{-5}) = m^2 + 5n^2$, much as in Example 3. Again $N$ is multiplicative, that
is, $N(xy) = N(x)\,N(y)$ holds for all $x$ and $y$ in $R$. Then $2 = qp$ gives

$$4 = N(2) = N(q)\,N(p) = N(q) \cdot 6,$$

which is impossible. Similarly, $p \mid 3$ is impossible, so $p$ is not a prime. □
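The divisibility claims in this example can also be tested by exact arithmetic. The sketch below stores $m + n\sqrt{-5}$ as the pair `(m, n)` and uses the observation that $v/u = v\bar{u}/N(u)$, so $u$ divides $v$ exactly when both coordinates of $v\bar{u}$ are multiples of $N(u)$. The function names `mul`, `norm`, and `divides` are hypothetical.

```python
def mul(u, v):
    """Product in Z(sqrt(-5)); m + n*sqrt(-5) is stored as the pair (m, n)."""
    m1, n1 = u
    m2, n2 = v
    return (m1 * m2 - 5 * n1 * n2, m1 * n2 + n1 * m2)

def norm(u):
    """N(m + n*sqrt(-5)) = m^2 + 5 n^2."""
    m, n = u
    return m * m + 5 * n * n

def divides(u, v):
    """True if u divides v in Z(sqrt(-5)); u must be nonzero.

    v / u equals v * conj(u) / N(u), so u | v exactly when both
    coordinates of v * conj(u) are multiples of N(u).
    """
    m, n = mul(v, (u[0], -u[1]))
    return m % norm(u) == 0 and n % norm(u) == 0

p = (1, 1)                          # p = 1 + sqrt(-5)
six = mul(p, (1, -1))               # p * (1 - sqrt(-5))
p_divides_6 = divides(p, (6, 0))    # True
p_divides_2 = divides(p, (2, 0))    # False
p_divides_3 = divides(p, (3, 0))    # False
```

So $p$ divides the product $2 \cdot 3$ without dividing either factor, which is precisely the failure of the prime property.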


The following analogue of Euclid’s lemma (Theorem 6 §1.2) will be needed; the 
routine inductive proof extends and is left to the reader. 


Lemma 2. Let $p$ be a prime in an integral domain $R$. If $p \mid a_1 a_2 \cdots a_n$ in $R$, then
$p \mid a_i$ for some $i = 1, 2, \dots, n$.


Lemma 3. If R is a UFD, then every irreducible in R is prime. 


Proof. Let $p \in R$ be irreducible. If $p \mid ab$ in $R$, write $ab = pd$. As $R$ is a UFD,
factor $a$, $b$, and $d$ into irreducibles: $a = p_1 \cdots p_k$, $b = q_1 \cdots q_l$, and $d = r_1 \cdots r_m$. Then
$pd = ab$ becomes $p r_1 \cdots r_m = p_1 \cdots p_k q_1 \cdots q_l$, so the uniqueness implies that either
$p \sim p_i$ for some $i$ or $p \sim q_j$ for some $j$. Hence, $p \mid a$ or $p \mid b$. □


Hence, an element in a UFD is prime if and only if it is irreducible (using Theorem 
5), so factorizations into irreducibles are actually factorizations into primes. 

Many factorization properties of Z extend automatically to any UFD; that is, 
the analogous proofs apply. We are going to discuss several of these properties and 
leave many details to the reader. We begin by describing divisors. 




Let $R$ be a UFD and let $a \in R$ be a nonzero nonunit. Then $a$ can be written
uniquely (up to associates) as a product

$$a = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r},$$

where $a_i \geq 1$ for each $i$, each $p_i$ is a prime in $R$ by Lemma 3, and the $p_i$ are
nonassociates (that is, $p_i \nsim p_j$ if $i \neq j$). Uniqueness means that the integers $r$,
$a_1, \dots, a_r$ are uniquely determined by $a$, as are the primes $p_i$ up to associates. We
claim that the divisors $d$ of $a$ are also determined uniquely (up to associates):

$$d \mid a \quad \text{if and only if} \quad d \sim p_1^{d_1} p_2^{d_2} \cdots p_r^{d_r}, \ \text{where } 0 \leq d_i \leq a_i \text{ for each } i. \qquad (*)$$

Clearly, each $d$ of this form is a divisor of $a$ (possibly a unit).⁶⁷ Conversely, if $d \mid a$,
then every prime divisor of $d$ must be associated with one of the $p_i$ by Lemma 2, so
the prime factorization of $d$ takes the form $d \sim p_1^{d_1} p_2^{d_2} \cdots p_r^{d_r}$, $d_i \geq 0$. Similarly, if
$a = db$ in $R$, then $b \sim p_1^{b_1} p_2^{b_2} \cdots p_r^{b_r}$, $b_i \geq 0$, so

$$p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r} = a = db \sim p_1^{d_1 + b_1} p_2^{d_2 + b_2} \cdots p_r^{d_r + b_r}.$$

Uniqueness implies that $a_i = d_i + b_i$ for each $i$, so $d_i \leq a_i$, as asserted in (*).
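The description (*) of divisors can be tried out in $\mathbb{Z}$, where each associate class of nonzero integers has a unique positive representative. The sketch below builds every product allowed by (*) from a given prime factorization and compares the result with a brute-force search; the function name `divisors_from_exponents` and the sample value 360 are chosen only for illustration.

```python
from itertools import product

def divisors_from_exponents(factorization):
    """Given {prime: exponent}, list every product of p^d with 0 <= d <= exponent."""
    primes = list(factorization)
    ranges = [range(factorization[p] + 1) for p in primes]
    result = []
    for ds in product(*ranges):       # one exponent vector (d_1, ..., d_r) at a time
        d = 1
        for p, e in zip(primes, ds):
            d *= p ** e
        result.append(d)
    return sorted(result)

a = 360                                        # 360 = 2^3 * 3^2 * 5
by_formula = divisors_from_exponents({2: 3, 3: 2, 5: 1})
by_search = [d for d in range(1, a + 1) if a % d == 0]
```

The two lists agree, and the count of positive divisors is $(3 + 1)(2 + 1)(1 + 1) = 24$, the number of admissible exponent vectors.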
Next, we define greatest common divisors and least common multiples in a UFD 


just as we did in $\mathbb{Z}$. Let $s_1, s_2, \dots, s_n$ be elements of an integral domain $R$. An
element $d$ of $R$ is called a greatest common divisor (gcd) of $s_1, \dots, s_n$, denoted
$\gcd(s_1, \dots, s_n)$, if it satisfies the following conditions:

(1) $d \mid s_i$ for each $i = 1, 2, \dots, n$.
(2) If $r \in R$ and $r \mid s_i$ for each $i = 1, 2, \dots, n$, then $r \mid d$.

Analogously, $m \in R$ is called a least common multiple (lcm) of $s_1, \dots, s_n$,
denoted by $\operatorname{lcm}(s_1, \dots, s_n)$, if it satisfies

(1) $s_i \mid m$ for each $i = 1, 2, \dots, n$.
(2) If $r \in R$ and $s_i \mid r$ for each $i = 1, 2, \dots, n$, then $m \mid r$.

These definitions agree with those previously defined in $\mathbb{Z}$ and $F[x]$, except that they
are required to be positive in $\mathbb{Z}$ and monic in $F[x]$. These extra conditions ensure
uniqueness in $\mathbb{Z}$ and $F[x]$, respectively, but no such device is available in an arbitrary
UFD. However, $\gcd(s_1, \dots, s_n)$ and $\operatorname{lcm}(s_1, \dots, s_n)$ are uniquely determined up to
associates in any integral domain $R$:

If $s_i \sim s_i'$ for each $i$, then $\gcd(s_1, \dots, s_n) \sim \gcd(s_1', \dots, s_n')$.

A similar remark applies to least common multiples, and we leave the details to the
reader (Exercise 25). Because we are ignoring the distinction between associates,
we denote any greatest common divisor of $s_1, \dots, s_n$ simply by $\gcd(s_1, \dots, s_n)$ and
any least common multiple by $\operatorname{lcm}(s_1, \dots, s_n)$.

Theorem 6 guarantees the existence of gcds and lcms of nonzero elements
in any UFD (see Exercise 24).


67 Note that allowing zero exponents in (*) includes divisors where some prime $p_i$ is missing.




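Theorem 6. Let $R$ be a UFD, and let $a, b, c, \dots$ be a finite list of nonzero elements
in $R$. If $p_1, p_2, \dots, p_r$ are the nonassociated primes dividing one of $a, b, c, \dots$, write

$$a \sim p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}, \quad a_i \geq 0 \text{ in } \mathbb{Z},$$
$$b \sim p_1^{b_1} p_2^{b_2} \cdots p_r^{b_r}, \quad b_i \geq 0 \text{ in } \mathbb{Z},$$
$$c \sim p_1^{c_1} p_2^{c_2} \cdots p_r^{c_r}, \quad c_i \geq 0 \text{ in } \mathbb{Z},$$

where an exponent is zero if the corresponding prime does not appear.⁶⁸ If we define
$d_i = \min(a_i, b_i, c_i, \dots)$ and $m_i = \max(a_i, b_i, c_i, \dots)$ for each $i = 1, 2, \dots, r$, then

$$\gcd(a, b, c, \dots) \sim p_1^{d_1} p_2^{d_2} \cdots p_r^{d_r} \quad \text{and} \quad \operatorname{lcm}(a, b, c, \dots) \sim p_1^{m_1} p_2^{m_2} \cdots p_r^{m_r}.$$

Proof. The proof of Theorem 9 §1.2 carries over. □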
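In $\mathbb{Z}$ the recipe of Theorem 6 can be carried out explicitly. The following sketch, for positive integers only, computes exponent tables by trial division and then takes the minimum and maximum exponent of each prime; the function names `exponents` and `gcd_lcm` and the sample inputs are hypothetical.

```python
from collections import Counter

def exponents(n):
    """Prime exponent table of a positive integer n, as a Counter {prime: exponent}."""
    table, d = Counter(), 2
    while d * d <= n:
        while n % d == 0:
            table[d] += 1
            n //= d
        d += 1
    if n > 1:
        table[n] += 1
    return table

def gcd_lcm(*nums):
    """Return (gcd, lcm) of positive integers using min and max of prime exponents."""
    tables = [exponents(n) for n in nums]
    primes = set().union(*tables)          # every prime dividing one of the inputs
    g = l = 1
    for p in primes:
        es = [t[p] for t in tables]        # a Counter gives 0 for a missing prime
        g *= p ** min(es)
        l *= p ** max(es)
    return g, l

g, l = gcd_lcm(360, 84, 150)   # 360 = 2^3 3^2 5, 84 = 2^2 3 7, 150 = 2 3 5^2
```

For these inputs the minimum exponents give $2 \cdot 3 = 6$ and the maximum exponents give $2^3 \cdot 3^2 \cdot 5^2 \cdot 7 = 12600$. The zero exponent returned for an absent prime plays the role described in the theorem.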


Warning. If $R$ is a UFD and $d = \gcd(a, b)$ in $R$, it may not be possible to express
$d$ in the form $d = xa + yb$ for some $x$ and $y$ in $R$. This conclusion holds if $R = \mathbb{Z}$ or
$R = F[x]$, $F$ a field, but it need not hold in general. It does hold whenever every
ideal of $R$ is principal; we consider these “principal
ideal domains” in Section 5.2.
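A standard illustration of the warning, given here as a sketch, takes $R = \mathbb{Z}[x]$, which is a UFD by the result proved at the end of this section. The polynomials $f$ and $g$ below are arbitrary elements of $\mathbb{Z}[x]$ introduced only for the argument.

```latex
% The only common divisors of 2 and x in \mathbb{Z}[x] are the units \pm 1, so
\gcd(2, x) \sim 1 .
% But for any f, g \in \mathbb{Z}[x], evaluating at x = 0 gives
\bigl(2f + xg\bigr)(0) \;=\; 2 f(0) \in 2\mathbb{Z},
% an even integer, so 2f + xg \neq 1 for every choice of f and g.
```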

Notwithstanding the warning, many properties of greatest common divisors 
familiar from the integers remain valid in any UFD, even though the method of 
proof is different. Example 6 illustrates this point. Compare the argument with the 
proof of Theorem 5(1) §1.2. 


Example 6. If $a \mid c$, $b \mid c$, and $\gcd(a, b) = 1$ in a UFD, show that $ab \mid c$.

Solution. As in Theorem 6, choose primes $p_1, \dots, p_r$ such that $a \sim p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}$,
$b \sim p_1^{b_1} p_2^{b_2} \cdots p_r^{b_r}$, and $c \sim p_1^{c_1} p_2^{c_2} \cdots p_r^{c_r}$, where $a_i \geq 0$, $b_i \geq 0$, and $c_i \geq 0$ for all $i$.
Then $a \mid c$ and $b \mid c$ give $a_i \leq c_i$ and $b_i \leq c_i$, respectively, for all $i$, whereas $\gcd(a, b) = 1$
means that $\min(a_i, b_i) = 0$ for all $i$. Thus, $a_i = 0$ or $b_i = 0$ for all $i$, so $a_i + b_i$ is $a_i$
or $b_i$. In particular, $a_i + b_i \leq c_i$ holds for all $i$, whence $ab \mid c$. □
We need one more fact about gcds in an arbitrary integral domain. 


Lemma 4. Let $R$ be an integral domain and let $a, b, c$ be nonzero elements of $R$.
If $\gcd(a, b)$ and $\gcd(ca, cb)$ both exist in $R$, then $\gcd(ca, cb) \sim c \gcd(a, b)$.

Proof. Write $d = \gcd(a, b)$ and $d' = \gcd(ca, cb)$. Then $d \mid a$ and $d \mid b$, so $cd \mid ca$ and $cd \mid cb$.
Hence, $cd$ divides $\gcd(ca, cb) = d'$, say $d' = ucd$. So we show that $u$ is a unit.
Write $ca = d'x$, $x \in R$, so $ca = ucdx$. Hence $a = udx$ as $c \neq 0$, so $ud \mid a$. Similarly,
$ud \mid b$ so $ud$ divides $\gcd(a, b) = d$. If $vud = d$, then $vu = 1$ as $d \neq 0$, so $u$ is a unit. □


We can now prove our characterization of unique factorization domains. 


Theorem 7. The following are equivalent for an integral domain $R$:
(1) $R$ is a UFD.
(2) $R$ satisfies the ACCP, and $\gcd(a, b)$ exists for all nonzero $a, b \in R$.
(3) $R$ satisfies the ACCP, and every irreducible element in $R$ is prime.


Proof. (1) $\Rightarrow$ (2). Theorem 6 shows that gcds exist in $R$. Suppose, if possible, that
$(a_1) \subset (a_2) \subset (a_3) \subset \cdots$ in $R$; we look for a contradiction. We may assume that


68 If $a$ is a unit, for example, then $a_i = 0$ for each $i$.




a, #0. Moreover, a; is not a unit (because (a1) # R). So let a, = pk pk? ..-pkr, 
where p; are nonassociated primes and k; > 1 for each 7. We have a,|a for all 7, so 
air pa pi ..-pér for 0 < d; < k;. Thus, there are only finitely many nonassociated 
possibilities for the a;, and so there must exist m#n with am ~ ap. But then 
(am) = (a), a contradiction. Hence, R satisfies the ACCP. 

(2) ⇒ (3). Let $p \in R$ be irreducible and assume that $p \mid ab$; we must show that
$p \mid a$ or $p \mid b$. By (2), let $d = \gcd(a, p)$. Then $d \mid p$, so $d \sim p$ or $d \sim 1$ by hypothesis. In
the first case, $p \mid a$ because $p \sim d$ and $d \mid a$. In the second case, $\gcd(a, p) \sim 1$. But then
Lemma 4 gives $\gcd(ab, pb) \sim b$, so $p \mid b$ because $p \mid ab$ and $p \mid pb$. This proves (3).

(3) ⇒ (1). Given (3), each nonzero nonunit is a product of irreducibles by
Theorem 3, so it remains to show that such factorizations are unique. If not, let

\[ p_1 p_2 \cdots p_r \sim q_1 q_2 \cdots q_s \]

be distinct factorizations, where the $p_i$ and $q_j$ are irreducibles and $r + s$ is as small
as possible. If $r = 1$, then $p_1 \sim q_1 q_2 \cdots q_s$ and it follows that $s = 1$ because $p_1$ is
irreducible, a contradiction because the factorizations are distinct. So we may
assume that $r \geq 2$ and $s \geq 2$. Then $p_1 \mid q_1 q_2 \cdots q_s$, so $p_1$ divides one of the $q_j$ by
Lemma 2. By relabeling the $q_j$, assume that $p_1 \mid q_1$. Since $q_1$ is irreducible, this implies
that $p_1 \sim q_1$, whence $p_2 \cdots p_r \sim q_2 \cdots q_s$ are distinct factorizations, contradicting
the minimality of $r + s$. ■


Unique Factorization in R[x] 


We conclude this section with a proof that $R[x]$ is a UFD whenever $R$ is a UFD.
Theorem 7 guarantees that $R$ satisfies the ACCP, and so the same is true of $R[x]$
by Theorem 4. Hence, by Theorem 7 again, all that remains is to show that irreducible
polynomials in $R[x]$ are primes. This task is surprisingly difficult. Part of
the problem is that irreducible elements of $R$ remain irreducible as polynomials (of
degree 0) in $R[x]$. The following definition helps to circumvent this difficulty.

If R is a UFD and f is a nonzero polynomial in R[x], the greatest common 
divisor of the nonzero coefficients of f is called the content of f and is denoted 
c(f). And f is called a primitive polynomial if c(f) ~ 1. 


Example 7. In $\mathbb{Z}[x]$, $c(6 + 10x^2 + 15x^3) = 1$ but $c(6 + 9x^2 + 15x^3) = 3$.
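
The computation in Example 7 can be checked mechanically. The following is a minimal sketch in Python, assuming a polynomial in $\mathbb{Z}[x]$ is stored as a list of integer coefficients in increasing degree; the names `content` and `is_primitive` are illustrative choices, not notation from the text.

```python
from functools import reduce
from math import gcd

def content(coeffs):
    """Content of a nonzero polynomial in Z[x].

    coeffs is [a0, a1, a2, ...]. Since gcd(0, a) = |a|, zero coefficients
    do not affect the result, so this is the gcd of the nonzero coefficients
    (normalized to be positive).
    """
    if all(c == 0 for c in coeffs):
        raise ValueError("content is defined only for nonzero polynomials")
    return reduce(gcd, coeffs, 0)

def is_primitive(coeffs):
    """A polynomial over Z is primitive when its content is 1 (up to sign)."""
    return content(coeffs) == 1

f = [6, 0, 10, 15]   # 6 + 10x^2 + 15x^3
g = [6, 0, 9, 15]    # 6 + 9x^2 + 15x^3
print(content(f), content(g))            # 1 3
print(is_primitive(f), is_primitive(g))  # True False
```

Note that no single pair of coefficients of the first polynomial is relatively prime, yet the gcd of all three is 1.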


If $c = c(f)$ is the content of a nonzero polynomial $f$, then $c$ divides every
coefficient of $f$ and so $f = c f_1$, where $f_1$ is a polynomial uniquely determined up to
associates by $f$. Moreover $f_1$ is primitive, which is the first part of Lemma 5. We
leave the proof as Exercise 33.


Lemma 5. Let $R$ be a UFD and let $f \neq 0$ be a polynomial in $R[x]$.

(1) $f$ can be written as $f = c(f) f_1$, where $f_1 \in R[x]$ is primitive.
(2) If $0 \neq a \in R$, then $c(af) \sim a\,c(f)$.


Lemma 6. If $R$ is a UFD and $p \in R[x]$ is irreducible with $\deg p \geq 1$, then $p$ is
primitive.


Proof. Write $p = c p_1$, where $c = c(p)$. Then $c$ or $p_1$ is a unit in $R[x]$ because $p$ is
irreducible. But $p_1$ is not a unit because $\deg p_1 \geq 1$, so $c$ is a unit. Thus, $c \sim 1$,
which means that $p$ is primitive. ■




The following theorem, first proved by Gauss at the end of the eighteenth century,
is the key to proving that $R[x]$ is a UFD whenever $R$ has this property.


Theorem 8. Gauss' Lemma. Let $R$ be a UFD. If $f \neq 0$ and $g \neq 0$ in $R[x]$, then

\[ c(fg) \sim c(f)\,c(g). \]

In particular, the product of primitive polynomials is primitive.


Proof. Let $f = c(f) \cdot f_1$ and $g = c(g) \cdot g_1$, where $f_1$ and $g_1$ are primitive. Then

\[ c(fg) \sim c[c(f)c(g) f_1 g_1] \sim c(f)\,c(g)\,c(f_1 g_1) \]

by Lemma 5, so it suffices to prove the result when $f$ and $g$ are primitive. Hence,
assume that $c(f) \sim 1 \sim c(g)$ and suppose that $fg$ is not primitive. Then some prime
$p$ divides each coefficient of $fg$. Write $f = a_0 + a_1 x + \cdots$ and $g = b_0 + b_1 x + \cdots$.
Because $f$ and $g$ are primitive, $p$ does not divide every $a_i$ (or every $b_j$), so $n \geq 0$
and $m \geq 0$ exist such that

• $p$ does not divide $a_n$, but $p \mid a_i$ for $0 \leq i < n$, and
• $p$ does not divide $b_m$, but $p \mid b_j$ for $0 \leq j < m$.

The coefficient of $x^{m+n}$ in $fg$ is $c = \sum_{i+j=m+n} a_i b_j$. Thus, $p \mid c$ and $p$ divides every
term $a_i b_j$ except possibly $a_n b_m$. But then $p \mid a_n b_m$ too, so $p \mid a_n$ or $p \mid b_m$ because $p$ is
prime. This contradiction proves Gauss' lemma. ■
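
Gauss' lemma is easy to test numerically when $R = \mathbb{Z}$. The sketch below is only an illustration under that assumption: polynomials are coefficient lists in increasing degree, and the helper names `content` and `poly_mul` are hypothetical. It multiplies two polynomials and compares $c(fg)$ with $c(f)c(g)$.

```python
from functools import reduce
from math import gcd

def content(coeffs):
    """gcd of the coefficients of a nonzero polynomial over Z (gcd(0, a) = |a|)."""
    return reduce(gcd, coeffs, 0)

def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists [a0, a1, ...]."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            prod[i + j] += a * b   # a_i * b_j contributes to the x^(i+j) term
    return prod

f = [4, 6, 10]       # 4 + 6x + 10x^2, content 2
g = [9, 0, 3, 15]    # 9 + 3x^2 + 15x^3, content 3
fg = poly_mul(f, g)

print(fg)                                    # [36, 54, 102, 78, 120, 150]
print(content(fg), content(f) * content(g))  # 6 6
```

Over $\mathbb{Z}$ the contents can be normalized to be positive, so the association $c(fg) \sim c(f)c(g)$ shows up as an equality of integers.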


Our first use of Gauss' lemma is to prove Theorem 9, which, although useful in
itself, is needed in the proof of our main result (Theorem 10).


Theorem 9. Let $R$ be a UFD with field of quotients $F$. Regard $R \subseteq F$ as a subring
of $F$ as usual. If $p \in R[x]$ is irreducible in $R[x]$, then $p$ is irreducible in $F[x]$.


Proof. Let $p$ be irreducible in $R[x]$ and assume that $p = gh$ in $F[x]$. If $a$ and $b$
are the products of the denominators of the coefficients of $g$ and $h$, then $g_1 = ag$
and $h_1 = bh$ are in $R[x]$, and $abp = g_1 h_1$ is a factorization in $R[x]$. Moreover, $p$ is
primitive in $R[x]$ by Lemma 6, so Gauss' lemma gives

\[ ab \sim ab\,c(p) \sim c(abp) = c(g_1 h_1) \sim c(g_1)\,c(h_1). \qquad (**) \]

Now write $g_1 = c(g_1) g_2$ and $h_1 = c(h_1) h_2$, where $g_2$ and $h_2$ are primitive in $R[x]$.
Hence, $abp = g_1 h_1 = c(h_1) c(g_1) h_2 g_2$, so (**) implies that $p \sim h_2 g_2$ in $R[x]$. But then
$h_2$ or $g_2$ is a unit in $R[x]$. If $g_2 = u$ is a unit in $R$, then $ag = g_1 = c(g_1) g_2 = c(g_1) u$,
so $g = a^{-1} c(g_1) u$ is a unit in $F[x]$. Similarly, if $h_2 \in R^*$, then $h \in F[x]^*$. ■


Note that the converse of Theorem 9 is not true. For example, $3(x^2 + 1)$ is
irreducible in $\mathbb{Q}[x]$ but not in $\mathbb{Z}[x]$.
We can now prove the most important theorem of this section. 


Theorem 10. If $R$ is a UFD, the polynomial ring $R[x]$ is also a UFD.

Proof. By Theorems 4 and 7, it suffices to show that every irreducible $p$ in $R[x]$ is
prime. Accordingly, assume that $p \mid fg$ in $R[x]$; we must prove that $p \mid f$ or $p \mid g$.

Claim 1. It suffices to prove that $p \mid f$ or $p \mid g$ when $f$ and $g$ are primitive.

Proof. Let $hp = fg$, where $h \in R[x]$. By Lemma 5, write $f = a f_1$, $g = b g_1$, and
$h = d h_1$, where $a$, $b$, and $d$ are in $R$ and $f_1$, $g_1$, and $h_1$ are primitive in $R[x]$.
Because $p$ is also primitive (by Lemma 6), Gauss' lemma gives

\[ d \sim c(h) \sim c(hp) \sim c(fg) \sim c(f)\,c(g) \sim ab. \]

Hence, $h_1 p \sim f_1 g_1$ because $d h_1 p = hp = fg = ab f_1 g_1$. Hence, $p \mid f_1 g_1$ so, as $f_1$ and
$g_1$ are primitive, our assumption implies that $p \mid f_1$ or $p \mid g_1$. If $f_1 = kp$, then
$(ak)p = a f_1 = f$, so $p \mid f$. Similarly, $p \mid g_1$ implies $p \mid g$, proving Claim 1.


So assume that $hp = fg$, where $f$ and $g$ are primitive in $R[x]$. Let $F$ denote the
field of quotients of $R$ and, as usual, regard $R \subseteq F$ as a subring of $F$. Then $p \mid fg$
in $F[x]$, so, as $p$ is irreducible in $F[x]$ by Theorem 9, Theorem 11 §4.2 gives $p \mid f$ or
$p \mid g$ in $F[x]$, say $f = kp$, $k \in F[x]$. If $d$ is the product of all denominators of nonzero
coefficients of $k$, then $g_0 = dk \in R[x]$ and we have $df = g_0 p$. But $f$ is now assumed
to be primitive, so Gauss' lemma gives

\[ d \sim c(df) \sim c(g_0 p) \sim c(g_0)\,c(p) \sim c(g_0). \]

If we write $g_0 = c(g_0) g_1$ where $g_1 \in R[x]$, then $df = g_0 p = c(g_0) g_1 p$. As $d \sim c(g_0)$,
it follows that $f \sim g_1 p$, so $p \mid f$ in $R[x]$, as required. ■


In particular, $\mathbb{Z}[x]$ is a UFD, a result first proved by Gauss.

If $R$ is a UFD, then $R[x]$ is also a UFD by Theorem 10, so the theorem shows
that the ring $R[x, y] = (R[x])[y]$ of polynomials in two commuting indeterminates
is also a UFD. More generally, define the ring $R[x_1, \dots, x_n]$ of polynomials in $n$
commuting indeterminates inductively by

\[ R[x_1, \dots, x_n, x_{n+1}] = (R[x_1, \dots, x_n])[x_{n+1}] \]

for each $n \geq 1$. Then iterating Theorem 10 gives


Corollary 1. If $R$ is a UFD, so also is $R[x_1, \dots, x_n]$ for each $n \geq 1$.


Exercises 5.1


Throughout these exercises, $R$ is an integral domain unless stated otherwise.


1. If $0 \neq a = bc$ in $R$, show that $a \sim b$ if and only if $c \sim 1$.
2. If $a \sim a'$ and $b \sim b'$ in $R$, show that $a \mid b$ if and only if $a' \mid b'$.
3. In the ring $\mathbb{Z}(i)$ of Gaussian integers, show that
   (a) $(2 + i) \sim (1 - 2i)$   (b) $(1 + 2i) \sim (2 + i)$
4. Show that $(1 - \sqrt{5}) \sim (7 - 3\sqrt{5})$ in $\mathbb{Z}(\sqrt{5})$.
5. If $a \sim a'$ and $b \sim b'$ in $R$, show that $ab \sim a'b'$.
6. Show that an integral domain is a field if and only if $a \sim b$ for all $a \neq 0 \neq b$.
7. Find the units in $\mathbb{Z}(\sqrt{-5})$. [Hint: Example 3.]
8. Find the units in $\mathbb{Z}(\sqrt{-3})$. [Hint: Use $N(a + b\sqrt{-3}) = a^2 + 3b^2$ as in Example 3.]
9. If $R$ is an integral domain and $p \in R$, show that $p$ is irreducible if and only if
   $(p) \subseteq (a)$, where $a \notin R^*$, implies that $(p) = (a)$.
10. In each case, determine whether $p$ is irreducible in $\mathbb{Z}(i)$.
   (a) $p = 11$   (b) $p = 2 - i$   (c) $p = 5$   (d) $p = 7 - i$
11. Let $p \in \mathbb{Z}$ be a prime and assume that $p \equiv 3 \pmod{4}$. Show that $p$ is irreducible in
   $\mathbb{Z}(i)$. [Hint: Corollary to Theorem 8 §1.3.]
12. In each case, determine whether $p$ is irreducible in $\mathbb{Z}(\sqrt{-5})$.
   (a) $p = 6 + \sqrt{-5}$   (b) $p = 7$
   (c) $p = 29$   (d) $p = 2 - 3\sqrt{-5}$
13. In each case show that $p$ is irreducible in $\mathbb{Z}(\sqrt{-5})$ but is not a prime.
   (a) $p = 2 + \sqrt{-5}$   (b) $p = 1 + 2\sqrt{-5}$
14. In each case, determine whether $p$ is irreducible in $\mathbb{Z}(\sqrt{-3})$. [Hint: Use the norm
   function $N(m + n\sqrt{-3}) = m^2 + 3n^2$.]
   (a) $p = 3 + 2\sqrt{-3}$   (b) $p = 2 + 3\sqrt{-3}$
   (c) $p = 5$   (d) $p = 7$
15. Show that $1 + \sqrt{-3}$ is irreducible in $\mathbb{Z}(\sqrt{-3})$ but is not a prime.
16. Let $p \sim q$ in the integral domain $R$.
   (a) Show that $p$ is irreducible if and only if $q$ is irreducible.
   (b) Show that $p$ is a prime if and only if $q$ is a prime.
17. If $p \in \mathbb{Z}(\sqrt{-5})$, define $N(p)$ as in Example 3. If $N(p)$ is a prime in $\mathbb{Z}$, show that $p$
   is irreducible in $\mathbb{Z}(\sqrt{-5})$.
18. Show that $\mathbb{Z}(\sqrt{5})$ is not a UFD by showing that $1 + \sqrt{5}$ is an irreducible that is not
   a prime. [Hint: Use $N(m + n\sqrt{5}) = m^2 - 5n^2$.]
19. A commutative ring is said to satisfy the descending chain condition on principal
   ideals (DCCP) if $(a_1) \supseteq (a_2) \supseteq \cdots$ in $R$ implies that $a_n \sim a_{n+1} \sim \cdots$ for some $n \geq 1$
   (see Lemma 1). Show that an integral domain $R$ satisfies the DCCP if and only if $R$
   is a field.
20. Prove (2) ⇒ (1) in Lemma 1.
21. Show that $R$ has ACCP if and only if any nonempty family $\mathcal{F}$ of principal ideals of
   $R$ has a maximal member. [$(p)$ in $\mathcal{F}$ is called maximal in $\mathcal{F}$ if $(p) \subseteq (a)$, with $(a)$ in
   $\mathcal{F}$, implies that $(p) = (a)$.]
22. Show that $F[x]$ satisfies the ACCP for any field $F$ by modifying the argument in
   Example 4.
23. If $S$ is a UFD and $R$ is a subring, is $R$ necessarily a UFD? Justify your answer.
24. Assume that $\gcd(a, b, \dots)$ and $\operatorname{lcm}(a, b, \dots)$ exist in an integral domain.
   (a) Show that $\gcd(0, a, b, \dots) \sim \gcd(a, b, \dots)$ and $\operatorname{lcm}(0, a, b, \dots) \sim 0$.
   (b) If $u$ is a unit, show that $\operatorname{lcm}(u, a, b, \dots) \sim \operatorname{lcm}(a, b, \dots)$ and $\gcd(u, a, b, \dots) \sim 1$.
25. If $a_i \sim b_i$ in $R$ for $i = 1, 2, \dots, n$, show that when they exist,
   (a) $\gcd(a_1, \dots, a_n) \sim \gcd(b_1, \dots, b_n)$.
   (b) $\operatorname{lcm}(a_1, \dots, a_n) \sim \operatorname{lcm}(b_1, \dots, b_n)$.
26. Show that $\gcd(ba_1, \dots, ba_n) \sim b \cdot \gcd(a_1, \dots, a_n)$ in $R$ whenever both gcds exist in
   $R$ and $b \neq 0$. This extends Lemma 4.
27. Show that $\gcd[a, \gcd(b, c)] \sim \gcd[\gcd(a, b), c]$ whenever all the gcds exist in $R$.
   Moreover, show that this common value is $\gcd(a, b, c)$.
28. If $\gcd(a, b) \sim 1 \sim \gcd(a, c)$ in a UFD, show that $\gcd(a, bc) \sim 1$.
29. In a UFD, if $a \mid bc$ and $\gcd(a, b) \sim 1$, show that $a \mid c$.
30. In a UFD, show that $\gcd(a, b) \operatorname{lcm}(a, b) = ab$ for all $a \neq 0$, $b \neq 0$.
31. Show that $\operatorname{lcm}(a_1, \dots, a_n)$ exists in an integral domain $R$ if and only if the
   intersection $(a_1) \cap \cdots \cap (a_n)$ is a principal ideal.
32. Show that $\operatorname{lcm}(da, db, dc, \dots) \sim d \operatorname{lcm}(a, b, c, \dots)$ in a UFD for all nonzero $d, a, b, c, \dots$.
33. Prove Lemma 5. [Hint: Exercise 26.]
34. Let $R$ be a subring of an integral domain $S$ such that (1) $R^* = S^*$, and (2) if $s \in S$
   and $s \mid r$, $r \in R$, then $s \in R$. (For example, $S = R[x]$.)
   (a) Show that $p \in R$ is irreducible in $R$ if and only if it is irreducible in $S$.
   (b) If $S$ is a UFD, show that $R$ is a UFD.
   (c) Prove the converse of Theorem 10: If $R[x]$ is a UFD, then $R$ is a UFD.
35. Let $R$ be a UFD and let $g \mid f$ in $R[x]$, where $f \neq 0$. If $f$ is primitive, show that $g$ is
   also primitive.
36. Show that an integral domain $R$ is a UFD if and only if it satisfies the ACCP and
   $\operatorname{lcm}(a, b)$ exists for all $a \neq 0$, $b \neq 0$ in $R$. [Hint: If $p \mid ab$, $p$ irreducible, consider
   $m \sim \operatorname{lcm}(a, p)$. Use the fact that $m \mid ap$ and $m \mid ab$.]
37. Let $R$ be a UFD with field of quotients $F$, and let $f, g \in R[x]$. If $f$ and $g$ are primitive,
   show that $f \sim g$ in $R[x]$ if and only if $f \sim g$ in $F[x]$.
38. Let $R$ be a UFD with field of quotients $F$. If $p \in R[x]$ is primitive, and $p$ is irreducible
   in $F[x]$, show that $p$ is irreducible in $R[x]$.
39. (P.M. Cohn) Fix a prime $p$ in $\mathbb{Z}$ and let $R$ denote the set of all polynomials in $\mathbb{Z}[x]$
   with the coefficient of $x$ divisible by $p$.
   (a) Show that $R$ is an integral domain.
   (b) Show that $\gcd(p, px) = 1$ in $R$ but that $\operatorname{lcm}(p, px)$ does not exist in $R$.
40. (T.W. Hungerford) Let $R$ denote the set of polynomials $f$ in $\mathbb{Q}[x]$ with constant
   coefficient in $\mathbb{Z}$.
   (a) Show that $R$ is an integral domain, $R^* = \{1, -1\}$, and $R$ is not a UFD. [Hint:
   Consider $x, \tfrac{1}{2}x, \tfrac{1}{4}x, \tfrac{1}{8}x, \dots$]
   (b) Show that $f \in R$ is irreducible if and only if $f$ is one of two types: (1) $f \sim p$,
   $p$ a prime in $\mathbb{Z}$; (2) $f \sim h$, $h$ irreducible in $\mathbb{Q}[x]$, of positive degree, with constant
   coefficient 1.
   (c) Show that each irreducible in $R$ is prime.
   (d) Show that $\gcd(f, g)$ exists in $R$ for all $f \neq 0$, $g \neq 0$ in $R$. [Hint: Consider whether
   $x \mid f$ or $x \mid g$.]
   (e) If $f \neq 0$ in $R$, show that $f = t x^n h_1 \cdots h_r$, where $t \in \mathbb{Q}$, $n \geq 0$, and each $h_i$ is
   irreducible in $\mathbb{Q}[x]$ with constant coefficient 1. Moreover, if $f = t' x^{n'} h'_1 \cdots h'_s$ is another
   such representation, show that $t = t'$, $n = n'$, $r = s$, and (after relabeling) $h_i = h'_i$
   for all $i$.


5.2 PRINCIPAL IDEAL DOMAINS


Theorem 3 §5.1 shows that an integral domain satisfying the ascending chain 
condition on principal ideals has the property that every nonzero nonunit factors 
into irreducibles. It turns out that the following family of integral domains all have 
this property. 


An integral domain $R$ is called a principal ideal domain (PID) if every ideal
is principal.


Example 1. $\mathbb{Z}$ is a PID. Indeed, every additive subgroup of $\mathbb{Z}$ is cyclic of the form
$(m) = m\mathbb{Z}$ for some $m \in \mathbb{Z}$.


Example 2. If $F$ is a field, $F[x]$ is a PID by Theorem 1 §4.3.


Principal ideal domains are quite common. For example, every ring $R$ such that
$\mathbb{Z} \subseteq R \subseteq \mathbb{Q}$ is a PID (see Exercise 10). In addition, we will also show that the ring
$\mathbb{Z}(i)$ of gaussian integers is a PID.

One of the most useful facts about $\mathbb{Z}$ (and about $F[x]$, where $F$ is a field) is
that any two elements have a greatest common divisor that is a linear combination
of them. This is true in every PID.


Theorem 1. Let $R$ be a PID, and let $a_1, a_2, \dots, a_n$ be nonzero elements of $R$.
Then $d \sim \gcd(a_1, \dots, a_n)$ exists, and there exist $r_1, \dots, r_n \in R$ such that

\[ \gcd(a_1, \dots, a_n) = r_1 a_1 + \cdots + r_n a_n. \]


Proof. Let $A = \{r_1 a_1 + \cdots + r_n a_n \mid r_i \in R\}$ denote the set of all linear combinations
of the $a_i$. Then $A$ is an ideal of $R$ as is easily verified, so $A = (d)$ for some $d \in R$
because $R$ is a PID. Thus, $d = r_1 a_1 + \cdots + r_n a_n$ for some $r_i$, and we claim that
$d \sim \gcd(a_1, \dots, a_n)$. We have $d \mid a_i$ for each $i$ because $a_i \in A$. But if $r \mid a_i$ for all $i$,
then $r$ clearly divides $d = r_1 a_1 + r_2 a_2 + \cdots + r_n a_n$. Thus, $d \sim \gcd(a_1, \dots, a_n)$ by
the definition of the gcd, which proves Theorem 1. ■


The next theorem reconfirms the fact that $\mathbb{Z}$ and $F[x]$ are UFDs.

Theorem 2. Every PID is a UFD.

Proof. If $R$ is a PID, it suffices to verify the ACCP by Theorem 1 and Theorem
7 §5.1. If $(a_1) \subseteq (a_2) \subseteq \cdots$ in $R$, put $A = (a_1) \cup (a_2) \cup \cdots$. Then $A$ is an ideal
(verify), so let $A = (a)$ by hypothesis. Thus, $a \in (a_n)$ for some $n$, so

\[ (a) \subseteq (a_n) \subseteq (a_{n+1}) \subseteq \cdots \subseteq A = (a). \]

Hence, $(a_n) = (a_{n+1}) = \cdots$, as required by Lemma 1 §5.1. ■

The converse of Theorem 2 is false as the next example shows.

Example 3. Show that $\mathbb{Z}[x]$ is a UFD that is not a PID.

Solution. $\mathbb{Z}[x]$ is a UFD by Theorem 10 §5.1. Let $A = \{2n + xf \mid n \in \mathbb{Z},\ f \in \mathbb{Z}[x]\}$.
Then $A$ is an ideal of $\mathbb{Z}[x]$, which we claim is not principal. For if $A = (g)$,
$g \in \mathbb{Z}[x]$, then $g \mid 2$ because $2 \in A$, so $g = \pm 1$ or $g = \pm 2$. But then $A = \mathbb{Z}[x]$ or
$A = (2)$, respectively, and both these possibilities are false. □


If $R$ is a UFD, the least common multiple exists for any finite set of nonzero
elements of $R$ (Theorem 6 §5.1). In a PID we can say more. If $A_1, A_2, \dots, A_n$ are
ideals of any ring $R$, their sum is defined by

\[ A_1 + A_2 + \cdots + A_n = \{x_1 + x_2 + \cdots + x_n \mid x_i \in A_i \text{ for all } i\}. \]

This is easily verified to be an ideal of $R$ containing each $A_i$. In particular, if
$a_1, a_2, \dots, a_n$ are nonzero elements of $R$, then

\[ (a_1) + (a_2) + \cdots + (a_n) = \{r_1 a_1 + r_2 a_2 + \cdots + r_n a_n \mid r_i \in R\} \]

is the ideal considered in the proof of Theorem 1. Hence, that proof shows that (1)
below is true. The dual holds too:

(1) $d \sim \gcd(a_1, \dots, a_n)$ if and only if $(a_1) + (a_2) + \cdots + (a_n) = (d)$.
(2) $m \sim \operatorname{lcm}(a_1, \dots, a_n)$ if and only if $(a_1) \cap (a_2) \cap \cdots \cap (a_n) = (m)$.

We leave verification of (2) to the reader (see Exercise 31 §5.1).

The fact that $\mathbb{Z}_p$ is a field whenever $p \in \mathbb{Z}$ is a prime has the following analogue
in an arbitrary PID.


Theorem 3. The following are equivalent for a nonzero nonunit $p$ in a PID $R$:
(1) $p$ is a prime.
(2) $R/(p)$ is a field.
(3) $R/(p)$ is an integral domain.

In particular, every nonzero prime ideal of $R$ is maximal.


Proof. (1) ⇒ (2). Consider $x = a + (p)$ in $R/(p)$ and assume that $x \neq 0$. Then
$a \notin (p)$; that is, $p$ does not divide $a$. Now $A = \{ra + sp \mid r, s \in R\}$ is an ideal of $R$,
so, as $R$ is a PID, let $A = (d)$, $d \in R$. Then $d \mid p$ because $p \in A$, so $d \sim p$ or $d \sim 1$.
But $d \sim p$ means that $(p) = (d) = A$, whence $p \mid a$ because $a \in A$, contrary to
assumption. So $d \sim 1$, from which $A = (1) = R$. In particular, $1 \in A$, say $1 = ba + sp$.
If $y = b + (p)$, then $xy = 1$ in $R/(p)$, proving (2).

(2) ⇒ (3). Every field is an integral domain.

(3) ⇒ (1). Suppose that $p \mid ab$ in $R$; we must show that $p \mid a$ or $p \mid b$. Now $p \mid ab$
implies that $(a + (p))(b + (p)) = 0$ in $R/(p)$ so, by (3), $a + (p) = 0$ or $b + (p) = 0$
in $R/(p)$. Thus $p \mid a$ or $p \mid b$, proving (1).

Finally, let $A$ be a nonzero prime ideal of $R$, say $A = (p)$, $p \in R$. Then $R/A$ is an integral
domain, hence a field because (3) ⇒ (2). Thus, $A$ is a maximal ideal of $R$ by
Corollary 1 of Theorem 6 §3.3, proving the last sentence of the theorem. ■


Theorem 3 shows that nonzero prime ideals in a PID are maximal. However,
this property may fail in a UFD. For example, if $R = \mathbb{Z}[x]$, then $R$ is a UFD by
Theorem 10 §5.1. But $(x)$ is a prime ideal of $\mathbb{Z}[x]$ that is not maximal in $\mathbb{Z}[x]$,
because $R/(x) \cong \mathbb{Z}$ is an integral domain that is not a field.


Division Algorithms 


Another useful property of the integral domains $\mathbb{Z}$ and $F[x]$, $F$ a field, is that both
possess a division algorithm. This property leads to an interesting class of PIDs.

In general, if $R$ is an integral domain, we say that $R$ has a division algorithm
if a map $\delta : R \to \mathbb{N}$ exists (called a divisor function) such that the following
condition is satisfied:


DA   Given $a$ and $b \neq 0$ in $R$, there exist $q$ and $r$ in $R$ such that
     $a = qb + r$ and either $r = 0$ or $\delta(r) < \delta(b)$.


Example 4. If $F$ is a field, $\delta(f) = \deg f$ is a divisor function for $F[x]$ by Theorem
4 §4.1.


Example 5. Show that $\delta(a) = |a|$ is a divisor function for $\mathbb{Z}$.


Solution. Let $a, b \in \mathbb{Z}$ where $b \neq 0$. Then $|b| > 0$, so $a = q|b| + r$ by Theorem
1 §1.2, where $0 \leq r < |b|$. If $b > 0$, this reads $a = qb + r$; if $b < 0$, it becomes
$a = q(-b) + r = (-q)b + r$. Hence, DA holds in both cases. □


Theorem 4. Every integral domain $R$ with a division algorithm is a PID.


Proof. Let $R$ be such a domain with divisor function $\delta$, and let $B$ be an ideal of $R$.
If $B = 0$, then $B = (0)$ is principal. Otherwise, let $0 \neq b \in B$ be such that $\delta(b)$ is as
small as possible. Thus, $(b) \subseteq B$ and we claim this is equality. Given $a \in B$, write
$a = qb + r$, where $r = 0$ or $\delta(r) < \delta(b)$. But $r = a - qb \in B$, so this is a contradiction
if $r \neq 0$. Hence, $r = 0$, and so $a = qb \in (b)$. It follows that $B \subseteq (b)$, as required. ■


Let $R$ be an integral domain with a division algorithm. If $a$ and $b$ are nonzero
elements of $R$, we can compute $\gcd(a, b)$ using the euclidean algorithm. Since the
procedure is entirely analogous to that in $\mathbb{Z}$ (Section 1.2), we merely sketch it here.
The idea is to use DA repeatedly as follows:

\[
\begin{aligned}
a &= q_1 b + r_1, & &\text{where } r_1 = 0 \text{ or } \delta(r_1) < \delta(b), \\
b &= q_2 r_1 + r_2, & &\text{where } r_2 = 0 \text{ or } \delta(r_2) < \delta(r_1), \\
r_1 &= q_3 r_2 + r_3, & &\text{where } r_3 = 0 \text{ or } \delta(r_3) < \delta(r_2), \\
&\;\;\vdots \\
r_{m-1} &= q_{m+1} r_m + r_{m+1}, & &\text{where } r_{m+1} = 0 \text{ or } \delta(r_{m+1}) < \delta(r_m).
\end{aligned}
\]

Because $\delta(b) > \delta(r_1) > \delta(r_2) > \cdots$ is a sequence of nonnegative integers, the
process must encounter $r_{m+1} = 0$ at some stage where $r_m \neq 0$. Then $r_m \mid r_{m-1}$, so
$\gcd(r_{m-1}, r_m) \sim r_m$. Now, as for $\mathbb{Z}$ (see Example 4 §1.2), we get

\[ \gcd(a, b) \sim \gcd(b, r_1) \sim \gcd(r_1, r_2) \sim \cdots \sim \gcd(r_{m-1}, r_m) \sim r_m. \]

Thus, the algorithm produces $\gcd(a, b)$ as the last nonzero remainder. Finally, just
as for $\mathbb{Z}$, elimination of remainders in the preceding equations gives $\gcd(a, b) = r_m$
in the form $r_m = ra + sb$ with $r, s \in R$, as guaranteed by Theorem 1.
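
The procedure just sketched can be written out for $R = \mathbb{Z}$ with $\delta(a) = |a|$. The following Python sketch is an illustration only; the function name `extended_euclid` is a hypothetical choice. It carries the coefficients along with the remainders, so the "elimination of remainders" happens as the loop runs. Python's floor division returns a remainder of absolute value less than $|b|$, which is all that DA requires.

```python
def extended_euclid(a, b):
    """Return (d, r, s) with d = r*a + s*b, where d is the last nonzero
    remainder of the euclidean algorithm applied to the integers a and b."""
    # Invariant: r0 = x0*a + y0*b and r1 = x1*a + y1*b at every step.
    r0, r1 = a, b
    x0, x1 = 1, 0
    y0, y1 = 0, 1
    while r1 != 0:
        q = r0 // r1                      # quotient in DA
        r0, r1 = r1, r0 - q * r1          # next remainder, |r0 - q*r1| < |r1|
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0

d, r, s = extended_euclid(252, 198)
print(d, r, s)               # 18 4 -5
print(r * 252 + s * 198)     # 18
```

For negative inputs the value returned may be the negative of the usual positive gcd; the two are associates in $\mathbb{Z}$, which is all the theory above claims.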


Some Rings of Quadratic Integers 


We are going to show that the ring $\mathbb{Z}(i)$ of gaussian integers has a division
algorithm. The method we use applies to other integral domains that resemble $\mathbb{Z}(i)$.
For instance, Example 5 §5.1 shows that the integral domain $\mathbb{Z}(\sqrt{-5})$ is not a UFD
because it contains an element $p = 1 + \sqrt{-5}$, which is irreducible but not prime.
The study of such subrings of $\mathbb{C}$ was the source of the mathematics presented in this
chapter, and it has now evolved into a subject in its own right, algebraic number
theory. Note that $i$ and $\sqrt{-5}$ are roots of quadratic polynomials $x^2 + 1$ and $x^2 + 5$,
respectively. Accordingly, we discuss subrings of $\mathbb{C}$ that result from adjoining a root
of some quadratic $x^2 + m$ to $\mathbb{Z}$, where $m \in \mathbb{Z}$.

Throughout the discussion, $w$ denotes a complex number such that

\[ w^2 \in \mathbb{Z} \quad\text{and}\quad w \notin \mathbb{Q}. \]

The ring selected for study is

\[ \mathbb{Z}(w) = \{m + nw \mid m, n \in \mathbb{Z}\}. \]

Clearly, $\mathbb{Z}(w)$ is a subring of $\mathbb{C}$ and so is an integral domain. Moreover, the
representation $m + nw$ of elements of $\mathbb{Z}(w)$ is unique in the following sense:

    If $m + nw = m' + n'w$ in $\mathbb{Z}(w)$, then $m = m'$ and $n = n'$.

Indeed, if $m + nw = m' + n'w$, then $(n - n')w = m' - m$. If $n' \neq n$, this gives $w \in \mathbb{Q}$,
contrary to our assumption. Hence, $n' = n$ and so $m' = m$.

Most of what we have to say about $\mathbb{Z}(w)$ depends on two fundamental notions.
If $a = m + nw \in \mathbb{Z}(w)$, define the conjugate $a^*$ of $a$ and the norm $N(a)$ of $a$ by

\[ a^* = m - nw \quad\text{and}\quad N(a) = m^2 - w^2 n^2. \]

Thus $a^* \in \mathbb{Z}(w)$, and $N(a) \in \mathbb{Z}$ for all $a \in \mathbb{Z}(w)$ because $w^2 \in \mathbb{Z}$.


Example 6. If $a = m + ni$ in $\mathbb{Z}(i)$, then $a^* = m - ni$ is the usual complex conjugate
of $a$, and $N(a) = m^2 + n^2 = |a|^2$ is the square of the usual absolute value of $a$.


Example 7. If $a = m + n\sqrt{-5}$ in $\mathbb{Z}(\sqrt{-5})$, then the conjugate is $a^* = m - n\sqrt{-5}$
and $N(a) = m^2 + 5n^2$. This coincides with the usage in Example 3 §5.1.


The next theorem collects several basic properties of norms and conjugates in
$\mathbb{Z}(w)$. In the case of the gaussian integers $\mathbb{Z}(i)$, these properties reduce to familiar
facts about the complex numbers.


Theorem 5. Let $w \in \mathbb{C}$ satisfy $w^2 \in \mathbb{Z}$, $w \notin \mathbb{Q}$. Then the following properties hold
for all $a$ and $b$ in $\mathbb{Z}(w)$.

(1) $aa^* = N(a) = N(a^*)$.
(2) $(ab)^* = a^* b^*$ and $a^{**} = a$.
(3) $N(ab) = N(a)N(b)$.
(4) $a$ is a unit in $\mathbb{Z}(w)$ if and only if $N(a) = \pm 1$.
(5) $N(a) = 0$ if and only if $a = 0$.
(6) If $N(a)$ is a prime in $\mathbb{Z}$, then $a$ is irreducible in $\mathbb{Z}(w)$.


Proof. (1) and (2). The routine verifications are left as Exercise 11.

(3). By (1) and (2), $N(ab) = (ab)(ab)^* = ab\,a^* b^* = aa^*\,bb^* = N(a)N(b)$.

(4). If $a$ is a unit, then (3) gives $N(a)N(a^{-1}) = N(1) = 1$, so $N(a) = \pm 1$.
Conversely, if $N(a) = \pm 1$, then $a[N(a)a^*] = N(a)^2 = 1$ by (1). Thus, $a^{-1} = N(a)a^*$.

(5). If $a = m + nw$, then $N(a) = 0$ means that $m^2 - w^2 n^2 = 0$. If $n \neq 0$, this
gives $w = \pm(m/n) \in \mathbb{Q}$, contrary to assumption. So $n = 0$, from which $m = 0$, and
hence $a = 0$. The converse is clear.

(6). If $N(a)$ is a prime in $\mathbb{Z}$, let $a = bc$ in $\mathbb{Z}(w)$. Then $N(a) = N(b)N(c)$ in $\mathbb{Z}$,
so $N(b) = \pm 1$ or $N(c) = \pm 1$. Hence, $b$ or $c$ is a unit in $\mathbb{Z}(w)$ by (4). ■


Note that the converse to (6) of Theorem 5 is not true: In $\mathbb{Z}(\sqrt{-5})$, the element
$a = 1 + \sqrt{-5}$ is irreducible (Example 5 §5.1) but $N(a) = 6$ is not prime.
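
The norm and conjugate are easy to experiment with. The sketch below is a minimal illustration, assuming an element $m + nw$ of $\mathbb{Z}(w)$ is stored as the pair `(m, n)` and the integer $w^2$ is passed as the argument `w2`; the function names are hypothetical. It checks $aa^* = N(a)$ and the multiplicativity of the norm in $\mathbb{Z}(\sqrt{-5})$ for one pair of elements.

```python
def mul(a, b, w2):
    """Product (m + n w)(p + q w) in Z(w), where w*w = w2 is an integer."""
    (m, n), (p, q) = a, b
    return (m * p + w2 * n * q, m * q + n * p)

def conj(a):
    """Conjugate a* = m - n w."""
    m, n = a
    return (m, -n)

def norm(a, w2):
    """Norm N(a) = m^2 - w^2 n^2."""
    m, n = a
    return m * m - w2 * n * n

w2 = -5                  # work in Z(sqrt(-5))
a = (1, 1)               # 1 + sqrt(-5)
b = (2, 3)               # 2 + 3 sqrt(-5)

print(norm(a, w2))                         # 6
print(mul(a, conj(a), w2))                 # (6, 0), that is, a a* = N(a)
print(norm(mul(a, b, w2), w2))             # 294
print(norm(a, w2) * norm(b, w2))           # 294
```

Here $N(1 + \sqrt{-5}) = 6$ is composite even though $1 + \sqrt{-5}$ is irreducible, which is the failure of the converse of (6) noted above.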

Of course, the units in $\mathbb{Z}(w)$ are of interest. For example, if $a = m + n\sqrt{-2}$ is
a unit in $\mathbb{Z}(\sqrt{-2})$, then $m^2 + 2n^2 = N(a) = \pm 1$ by Theorem 5. This easily shows
that $1$ and $-1$ are the only units in $\mathbb{Z}(\sqrt{-2})$. In fact, this holds for $\mathbb{Z}(\sqrt{-d})$, where
$d > 0$ is any integer that is not a square.

On the other hand, $a = m + n\sqrt{2}$ is a unit in $\mathbb{Z}(\sqrt{2})$ if and only if the
norm $N(a) = m^2 - 2n^2 = \pm 1$. In particular, $u = 1 + \sqrt{2}$ is a unit in $\mathbb{Z}(\sqrt{2})$ where
$u^{-1} = -1 + \sqrt{2}$. Hence, $\pm u^k$ is a unit for any $k \in \mathbb{Z}$. (In fact, these are all the units
in $\mathbb{Z}(\sqrt{2})$; see Exercise 31.) In this case, if $d > 0$ is any integer that is not a square,
then $m + n\sqrt{d}$ is a unit in $\mathbb{Z}(\sqrt{d})$ if and only if

\[ m^2 - dn^2 = \pm 1. \]

This is sometimes called Pell's equation, and solutions with $m \neq \pm 1$ always
exist.⁶⁹ Hence, $\mathbb{Z}(\sqrt{d})$ has a unit $u \neq \pm 1$, so taking powers of $u$ gives infinitely
many solutions of Pell's equation, an observation made originally by Fermat.


Example 8. If $d > 0$ is a nonsquare integer, then $\mathbb{Z}(\sqrt{d})$ has infinitely many units.
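
To see the powers of a unit producing solutions of $m^2 - dn^2 = \pm 1$, one can tabulate $u^k$ for $u = 1 + \sqrt{2}$. This is a small illustrative sketch only, assuming elements $m + n\sqrt{d}$ are stored as pairs `(m, n)`; the helper names are hypothetical.

```python
def mul(a, b, d):
    """Product (m + n sqrt(d))(p + q sqrt(d)) as a pair of integers."""
    (m, n), (p, q) = a, b
    return (m * p + d * n * q, m * q + n * p)

def norm(a, d):
    """N(m + n sqrt(d)) = m^2 - d n^2."""
    m, n = a
    return m * m - d * n * n

d = 2
u = (1, 1)            # u = 1 + sqrt(2), with N(u) = -1
power = (1, 0)        # u^0 = 1
units = []
for k in range(1, 6):
    power = mul(power, u, d)   # power is now u^k
    units.append(power)

print(units)                          # [(1, 1), (3, 2), (7, 5), (17, 12), (41, 29)]
print([norm(x, d) for x in units])    # [-1, 1, -1, 1, -1]
```

Each pair $(m, n)$ in the list satisfies $m^2 - 2n^2 = \pm 1$, and the entries grow without bound, so the powers are distinct.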
We now turn to the factorization theory in Z(w). 


Theorem 6. Every nonzero nonunit in $\mathbb{Z}(w)$ is a product of irreducibles.

Proof. By Theorem 3 §5.1, it suffices to show that $\mathbb{Z}(w)$ satisfies the ACCP; that
is, a strictly increasing chain $0 \subset (a_1) \subset (a_2) \subset \cdots$ in $\mathbb{Z}(w)$ is impossible. Suppose
such a chain exists. Then $a_{n+1} \mid a_n$ for each $n \geq 1$, say $a_n = b_n a_{n+1}$. Moreover,
$b_n$ is not a unit because $(a_n) \neq (a_{n+1})$, so $|N(b_n)| > 1$ by Theorem 5. But then
$|N(a_n)| > |N(a_{n+1})|$ because $N(a_n) = N(b_n)N(a_{n+1})$, so $|N(a_1)| > |N(a_2)| > \cdots$
is a strictly decreasing sequence of nonnegative integers, a contradiction. ■

Theorem 6 notwithstanding, it is difficult to determine which choices of $w$ make
$\mathbb{Z}(w)$ into a UFD or a PID. We content ourselves with one condition that guarantees
that $\mathbb{Z}(w)$ has a division algorithm. We need a technical lemma.


Lemma 1. Assume that, for any $r$ and $s$ in $\mathbb{Q}$, there exist $m$ and $n$ in $\mathbb{Z}$ such that

\[ |(r - m)^2 - w^2 (s - n)^2| < 1. \]

Then $\delta(a) = |N(a)|$ defines a divisor function $\mathbb{Z}(w) \to \mathbb{N}$.


Proof. Given $a$ and $b \neq 0$ in $\mathbb{Z}(w)$, we must find $r$ and $q$ in $\mathbb{Z}(w)$ such that $a = qb + r$
and either $r = 0$ or $\delta(r) < \delta(b)$. Working in $\mathbb{C}$ yields

\[ \frac{a}{b} = \frac{ab^*}{bb^*} = \frac{1}{N(b)}\, ab^*. \]

Hence, $a/b$ has the form $a/b = r + sw$, where $r$ and $s$ are in $\mathbb{Q}$, so $a = (r + sw)b$.
Choose $m$ and $n$ as in the hypothesis, write $q = m + nw$, and define

\[ r = -qb + a = (-m - nw)b + (r + sw)b = [(r - m) + (s - n)w]b. \]

Now observe that $\delta(xy) = \delta(x)\delta(y)$ for all $x, y \in \mathbb{Z}(w)$ by Theorem 5. Then

\[ \delta(r) = \delta[(r - m) + (s - n)w]\,\delta(b) = |(r - m)^2 - w^2 (s - n)^2|\,\delta(b) < \delta(b), \]

which proves DA. ■

Theorem 7. The ring $\mathbb{Z}(i)$ of gaussian integers has a division algorithm with
divisor function $\delta(a) = |N(a)|$ for all $a \in \mathbb{Z}(i)$. That is, $\delta(m + ni) = m^2 + n^2$ for all
$m, n \in \mathbb{Z}$.


Proof. If $r$ is a rational number and $m$ is the integer closest to $r$, it is clear that
$|r - m| \leq \tfrac{1}{2}$. Similarly, let $|s - n| \leq \tfrac{1}{2}$, $n \in \mathbb{Z}$. Thus, Lemma 1 applies because

\[ (r - m)^2 - i^2 (s - n)^2 = (r - m)^2 + (s - n)^2 \leq \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}. \] ■


⁶⁹ For a detailed discussion using continued fractions, see Davenport, H., The Higher Arithmetic,
New York: Harper, 1960.


Euclidean Domains 


In Theorem 7, the divisor function $\delta(a) = |N(a)|$ on $\mathbb{Z}(i)$ has the property that
$\delta(ab) = \delta(a)\delta(b)$ for all $a, b \in \mathbb{Z}(i)$. Since $\delta(a) = 0$ if and only if $a = 0$ by Theorem
5, it follows that $\delta(a) \geq 1$ whenever $a \neq 0$, and hence $\delta(ab) \geq \delta(a)$ whenever
$a \neq 0$ and $b \neq 0$. This suggests the following definition:

An integral domain $R$ is called a euclidean domain if it has a division algorithm
with divisor function $\delta : R \to \mathbb{N}$ that satisfies DA and the following condition:


E   If $a \neq 0$ and $b \neq 0$ in $R$, then $\delta(ab) \geq \delta(a)$.


Example 9. The ring $\mathbb{Z}$ is euclidean where $\delta(a) = |a|$ for all $a \neq 0$ in $\mathbb{Z}$.

Example 10. If $F$ is a field, $F[x]$ is euclidean if $\delta(f) = \deg f$ for all $f \neq 0$ in $F[x]$.

Example 11. The gaussian integers $\mathbb{Z}(i)$ is euclidean if $\delta(m + ni) = m^2 + n^2$.


Some PIDs are not euclidean, but examples are not easy to find. In 1949, T.
Motzkin provided the first such example: $\mathbb{Z}(\tfrac{1}{2}(1 + \sqrt{-19}))$ is a noneuclidean PID.⁷⁰
The extra condition E gives more information about the euclidean ring in terms of
the mapping $\delta$. Example 12 characterizes the units in terms of $\delta$.


Example 12. Let $R$ be a euclidean domain. If $0 \neq a \in R$, show that $\delta(1) \leq \delta(a)$
and that $a$ is a unit if and only if $\delta(1) = \delta(a)$.


Solution. We have $\delta(1) \leq \delta(1 \cdot a) = \delta(a)$ by E. If $a$ is a unit, $\delta(a) \leq \delta(aa^{-1}) = \delta(1)$,
again by E, so $\delta(a) = \delta(1)$. Conversely, if $\delta(a) = \delta(1)$, write $1 = qa + r$, where $r = 0$
or $\delta(r) < \delta(a)$. If $r \neq 0$, we obtain $\delta(1) \leq \delta(1 \cdot r) < \delta(a) = \delta(1)$, a contradiction. So
$r = 0$, $1 = qa$, and $a$ is a unit. □

There is a lot of information available on these euclidean domains, but we will
not pursue this here. Instead, given $a$ and $b \neq 0$ in $\mathbb{Z}(i)$, we conclude by giving an
example of how the technique used to prove Lemma 1 can be employed to actually
find $q$ and $r$ in $\mathbb{Z}(i)$ such that $a = qb + r$ and either $r = 0$ or $\delta(r) < \delta(b)$.


⁷⁰ See Motzkin, T., Bulletin of the American Mathematical Society, 55 (1949), 1142-1146. See also
Campoli, O.A., American Mathematical Monthly, 95 (1988), 868-871; Wilson, J.C., Mathematics
Magazine, 46 (1973), 34-38; Williams, K.S., Mathematics Magazine, 48 (1975), 176-177.


Example 13. Let $a = 7 + 8i$ and $b = 2 - i$ in $\mathbb{Z}(i)$. Find $q$ and $r$ in $\mathbb{Z}(i)$ such that
$a = qb + r$ and either $r = 0$ or $\delta(r) < \delta(b)$.


Solution. The technique in Lemma 1 applies. Compute in $\mathbb{C}$

\[ \frac{a}{b} = \frac{ab^*}{bb^*} = \frac{(7 + 8i)(2 + i)}{2^2 + 1^2} = \frac{6 + 23i}{5}. \]

Now the closest integers to $6/5$ and $23/5$ are 1 and 5, respectively. Hence, we write
$6/5 = 1 + 1/5$ and $23/5 = 5 - 2/5$ to get

\[ \frac{a}{b} = \tfrac{6}{5} + \tfrac{23}{5} i = (1 + 5i) + \left(\tfrac{1}{5} - \tfrac{2}{5} i\right). \]

Thus

\[ a = (1 + 5i)b + \left(\tfrac{1}{5} - \tfrac{2}{5} i\right)b = (1 + 5i)b + (0 - i), \]

so $q = 1 + 5i$ and $r = -i$. Note that $\delta(r) = 1 < 5 = \delta(b)$. □
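
The rounding technique of Example 13 is mechanical enough to code directly. The following Python sketch is an illustration under stated assumptions: a gaussian integer $m + ni$ is stored as the pair `(m, n)`, and the name `gauss_divmod` is hypothetical. It computes $a/b = ab^*/N(b)$ exactly with rational arithmetic and rounds each coordinate to a nearest integer.

```python
from fractions import Fraction

def gauss_divmod(a, b):
    """Return (q, r) in Z(i) with a = q*b + r and N(r) < N(b), for b != 0.

    Elements m + ni are represented as pairs (m, n).
    """
    (a1, a2), (b1, b2) = a, b
    nb = b1 * b1 + b2 * b2                      # N(b) = b b*
    # a b* = (a1 + a2 i)(b1 - b2 i); dividing by N(b) gives a/b = r + s i.
    r = Fraction(a1 * b1 + a2 * b2, nb)
    s = Fraction(a2 * b1 - a1 * b2, nb)
    m, n = round(r), round(s)                   # nearest integers, |r-m|, |s-n| <= 1/2
    q = (m, n)
    qb = (m * b1 - n * b2, m * b2 + n * b1)     # q * b
    rem = (a1 - qb[0], a2 - qb[1])              # remainder a - q b
    return q, rem

q, r = gauss_divmod((7, 8), (2, -1))
print(q, r)    # (1, 5) (0, -1), that is, q = 1 + 5i and r = -i
```

When a coordinate lies exactly halfway between two integers, `round` picks one of them; either choice keeps the distance at most $\tfrac{1}{2}$, so the bound in the proof of Theorem 7 still applies.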


Ernst Eduard Kummer (1810-1893) Kummer entered the University of Halle at 
the age of 18 and within 3 years had a Ph.D. in mathematics. He became a professor at 
the University of Breslau in 1842, and in 1855 he succeeded Dirichlet at the University 
of Berlin. Kummer is best remembered as the creator, with Dedekind and Kronecker, of 
algebraic number theory. As described in the introduction to this chapter, Kummer was 
interested in Fermat's last theorem and was led to consider why the unique factorization 
into primes failed in Z(w), where w is a root of unity. His creation of ideal numbers, for 
which the uniqueness can be restored, has been compared to the creation of noneuclidean 
geometry. Its importance as a mathematical achievement stems from the fact that it 
led, via Dedekind, to the modern notion of an ideal. 


In addition to algebra, Kummer also made contributions to geometry, analysis, and 
physics. He was a popular lecturer and directed many Ph.D. students. In 1857, he was 
awarded the grand prize in mathematics of the French Academy of Sciences. 


Exercises 5.2


1. Is every subring of a PID again a PID? Support your answer.
2. If $F$ is a field, show that $F[x, y]$ is a UFD that is not a PID. [Hint: Consider
3. Show that every field $F$ is a PID.
4. Is $\mathbb{Z}(\sqrt{-5})$ a PID? Defend your answer.
5. If $R$ is a PID and $A \neq 0$ is an ideal of $R$, show that $R/A$ has a finite number of ideals,
   all of which are principal.
6. (a) Is every prime ideal of a PID maximal? Support your answer.
   (b) Show that every ideal $A \neq R$ in a PID $R$ is contained in a maximal ideal of $R$.
7. Show that the following conditions are equivalent for an integral domain $R$.
   (a) $R$ is a field, (b) $R[x]$ is euclidean, and (c) $R[x]$ is a PID.
8. Let $p \in \mathbb{Z}$ be a prime and define $\mathbb{Z}_{(p)} = \{\tfrac{m}{n} \in \mathbb{Q} \mid p \text{ does not divide } n\}$.
   (a) Show that $\mathbb{Z}_{(p)}$ is an integral domain (called the localization of $\mathbb{Z}$ at $p$) and find
   the units.
   (b) If $A \neq 0$ is an ideal of $\mathbb{Z}_{(p)}$, show that $A = (p^k)$, where $k \geq 0$ is the smallest
   integer such that $p^k \in A$. [Hint: If $0 \neq m \in \mathbb{Z}$, then $m = p^r d$, where $r \geq 0$ and $p$ does
   not divide $d$.]
   (c) Show that $\mathbb{Z}_{(p)}$ is a PID with exactly one maximal ideal.
9. Let $\mathbb{Z}_{(p)}$ be as in Exercise 8. Show that $\mathbb{Z}_{(p)}$ is a euclidean domain where, for each
   $a \neq 0$ in $R$, $\delta(a) = k$ where $(a) = (p^k)$. Indeed, show that $\delta(ab) = \delta(a) + \delta(b)$ for all
   $a \neq 0$, $b \neq 0$ in $\mathbb{Z}_{(p)}$ and that, if $a + b \neq 0$, then $\delta(a + b) \geq \min\{\delta(a), \delta(b)\}$.
10. Let $R$ be a ring such that $\mathbb{Z} \subseteq R \subseteq \mathbb{Q}$. Show that $R$ is a PID. [Hint: If $I$ is an ideal of
   $R$, consider $A = \mathbb{Z} \cap I$.]
11. (a) Prove (1) and (2) of Theorem 5.
   (b) Prove that the converse of (6) in Theorem 5 is false. [Hint: In Example 5 §5.1,
   consider $a = 1 + \sqrt{-5}$.]
12. Let $w$ be as in Theorem 5 and assume that $w^2 < 0$. Show that $\mathbb{Z}(w)$ has finitely many
   units.
13. (a) Show that $\mathbb{Z}(\sqrt{-2})$ is euclidean with $\delta(a) = |N(a)|$.
   (b) If $a = 4 + 3\sqrt{-2}$ and $b = 3 - \sqrt{-2}$, write $a = qb + r$, where $r = 0$ or $\delta(r) < \delta(b)$.
14. (a) Show that $\mathbb{Z}(\sqrt{2})$ is euclidean with $\delta(a) = |N(a)|$.
   (b) If $a = 5 + 7\sqrt{2}$ and $b = 3 + 4\sqrt{2}$, write $a = qb + r$, where $r = 0$ or $\delta(r) < \delta(b)$.
15. (a) Show that $\mathbb{Z}(\sqrt{3})$ is euclidean with $\delta(a) = |N(a)|$.
   (b) If $a = 4 + 5\sqrt{3}$ and $b = 1 + \sqrt{3}$, write $a = qb + r$, where $r = 0$ or $\delta(r) < \delta(b)$.
16. Show that $\mathbb{Z}(\sqrt{-3})$ is not euclidean with $\delta(a) = |N(a)|$. [Hint: Try $a = 1 + \sqrt{-3}$ and
   $b = 2$.]
17. If $R$ is a euclidean domain, and if $m > 0$ and $k$ are integers, show that $\delta'$ satisfies DA
   and E, where $\delta'(a) = m \cdot \delta(a) + k$.
18. (a) If $F$ is a field, show that $F$ is euclidean.
   (b) If the mapping $\delta$ is constant in a euclidean domain $R$, show that $R$ is a field.
19. If $a \sim b$ in a euclidean domain $R$, show that $\delta(a) = \delta(b)$.
20. If $a \mid b$ and $\delta(a) = \delta(b)$ in a euclidean domain $R$, show that $a \sim b$.
21. Let $b \neq 0$ in a euclidean domain $R$. Show that $b$ is a nonunit if and only if $\delta(ab) > \delta(a)$
   for all $a \neq 0$ in $R$. [Hint: Exercises 19 and 20.]
22. Assume that $R$ is a euclidean domain in which $\delta(a + b) \leq \max\{\delta(a), \delta(b)\}$ whenever
   $a$, $b$, and $a + b$ are nonzero. Show that $q$ and $r$ are uniquely determined
   in DA.
23. Suppose that a euclidean domain $R$ has a unique maximal ideal $P$. Write $P = (p)$ by
   Theorem 4.
   (a) Show that $P$ consists of nonunits; that is, $a$ is a nonunit if and only if $p \mid a$. [Hint:
   Exercise 6.]
   (b) Show that every ideal $A \neq 0$ of $R$ has the form $A = (p^k)$ for some $k \geq 0$.
24. (a) If $A = (1 + i)$ in $\mathbb{Z}(i)$, show that $\mathbb{Z}(i)/A$ is a finite field and find its order.
   (b) If $A = (1 + 2i)$ in $\mathbb{Z}(i)$, show that $\mathbb{Z}(i)/A$ is a finite field and find its order.
25. For $w$ as in Theorem 5, show that $\mathbb{Q}(w) = \{r + sw \mid r, s \in \mathbb{Q}\}$ is the field of quotients
   of $\mathbb{Z}(w)$.
26. An ideal $A$ of a commutative ring $R$ is said to be finitely generated if we
   have $A = \{r_1 a_1 + r_2 a_2 + \cdots + r_n a_n \mid r_i \in R\}$ for some $a_1, a_2, \dots, a_n$ in $A$. We write
   $A = (a_1, \dots, a_n)$ in this case and say that $a_1, a_2, \dots, a_n$ generate $A$.
(a) Show that the following conditions are equivalent for an integral domain R (then 
called a Bézout domain): 

(1) Every 2-generated ideal A = (a, 6) is principal. 

(2) Ifa #0 and b #0, then d = ged(a, d) exists and d= ra + sb for some 


r,sER, 
(b) If R is a Bézout domain, show that every finitely generated ideal is 
principal; in fact, for all a1,...,@, in R, show that d~ gcd(aj,...,@n) exists and 


that (a1,...,@n) = (d). 

Let R be an integral domain. Show that R is a PID if and only if it satisfies the 
ACCP and each 2-generated ideal (a,b) is principal. [Hint: Exercise 26.] 

Let R be a UFD. Show that R is a PID if and only if for alla #AOQandb#O0in R,r 
and s exist in R such that gcd(a,b) ~ ra+ sb. [Hint: Exercises 26 and 27.] 

Let a = bc in a PID R where ged(b,c) ~ 1. Show that ia ~ e x ot [Hint: Chinese 
remainder theorem, Theorem 8 §3.4.] 


30. Let $R$ be a PID and let $A$ be an ideal of $R$ that satisfies the condition that $r^2 \in A$,
$r \in R$, implies that $r \in A$. Show that $R/A$ is isomorphic to a finite direct product of
fields. [Hint: Exercise 44 §3.4.]

31. Show that every unit of $\mathbb{Z}(\sqrt{2})$ has the form $\pm u^k$, where $k \in \mathbb{Z}$ and $u = 1 + \sqrt{2}$.
[Hint: If $v > 0$ is a unit in $\mathbb{Z}(\sqrt{2})$, show that either $v = u^k$ for some integer $k$ or
$u^k < v < u^{k+1}$ for some $k$. Rule out the second case by showing that $1 < v < u$ is
impossible if $v$ is a unit ($v > 1$ implies that $-1 < v^* < 1$).]

32. For $a = m + nw$ in $\mathbb{Z}(w)$, define the integral part of $a$ by $\operatorname{int} a = m$. Then write
$(a, b) = \operatorname{int}(ab^*)$ for all $a$ and $b$ in $\mathbb{Z}(w)$. If $w$ is as in Theorem 5, prove that the
following hold for all $a$, $b$, and $c$ in $\mathbb{Z}(w)$.
(a) $(a, b) = (b, a)$
(b) $(ka, b) = k(a, b)$ for all $k \in \mathbb{Z}$
(c) $(a + b, c) = (a, c) + (b, c)$
(d) $(a, a) = N(a)$

33. Can the integral domain $\mathbb{Z}(\sqrt{-2})$ be ordered (Section 3.5)? Defend your answer.

34. (a) Show that $\theta : \mathbb{Z}(w) \to M_2(\mathbb{Z})$ is a one-to-one ring homomorphism if


bGnaenal= m a 
nm om 

(b) Show that $N(a) = \det[\theta(a)]$ for all $a \in \mathbb{Z}(w)$.

35. Let $R = \mathbb{Z}(w)$, where $w$ is as in Theorem 5, and define $\tau : R \to R$ by $\tau(a) = a^*$ for all
$a \in R$.
(a) Show that $\tau$ is a ring automorphism satisfying $\tau^2 = 1_R$.
(b) If $\sigma : R \to R$ is a ring automorphism satisfying $\sigma^2 = 1_R$, show that $\sigma = \tau$ or
$\sigma = 1_R$.

36. If $R$ is a PID and $A \neq 0$ is an ideal of $R$, show that every ideal of $R/A$ is the annihilator
of an element. [Hint: Every ideal of $R/A$ has the form $B/A$. If $A = (a)$, $B = (b)$, then
$a = bc$, $c \in R$. Show that $B/A = \operatorname{ann}(c + A)$.]


Chapter 6 
Fields 


There is astonishing imagination, even in the science of mathematics.... We repeat, there 
was more imagination in the head of Archimedes than in that of Homer. 


—Voltaire 


Human beings have sought solutions to algebraic equations for centuries. This
search has inspired some of the most creative (and important) mathematics
imaginable. Suppose that a primitive tribe, motivated by the desire to count things and
to tell others the results, has developed a facility with the set $\mathbb{N} = \{0, 1, 2, \dots\}$ of
natural numbers to the point where they can add and multiply. Then they can
solve certain equations: for example, $x + 3 = 7$ has the unique solution $x = 4$.
However, they declare that, despite the efforts of their finest mathematicians, the
equation $x + 3 = 2$ has no solution. We, of course, know that they have an
inadequate number supply and are not aware of the existence of the negative integers.
To put it another way, they have invented a system $\mathbb{N}$ of numbers that is adequate
for ordinary counting, but they must invent a larger number system $\mathbb{Z}$ to be able
to solve the equation $x + a = b$ for any $a$ and $b$ in $\mathbb{N}$.

Of course, $\mathbb{Z}$ is also inadequate. For example, $3x = 5$ has no solution in $\mathbb{Z}$,
and the set $\mathbb{Q}$ of rational numbers must be invented to solve equations of the form
$ax = b$. Again, the equation $x^2 = 2$ has no solution in $\mathbb{Q}$ and so the (much) larger
set $\mathbb{R}$ of real numbers is needed. Even $\mathbb{R}$ is deficient: $x^2 = -1$ has no solution in
$\mathbb{R}$, which leads to the invention of the set $\mathbb{C}$ of complex numbers. This step is in a
sense the end of the story because, thanks to Gauss, we know that $f(x) = 0$ has a
solution in $\mathbb{C}$ for every nonconstant polynomial $f$ with coefficients in $\mathbb{C}$.

Although these number systems did not quite evolve in this way historically, 
the pattern is clear. When faced with an algebraic equation with no solution in a 
known number system, the idea is to invent a larger number system that contains 


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 




a solution. This process of adjoining solutions plays a major role in field theory.
Let $F$ be any field and let $f$ be a polynomial in $F[x]$ that has no root in $F$. Then a
larger field $E$ containing $F$ can be constructed that contains a root of $f$. Moreover,
by repeating the process, we can find a field $K$ containing $F$ such that $f$ factors
completely as a product of linear factors in $K[x]$. Finally, the smallest such field (in
a suitable sense) is uniquely determined by $F$ and $f$ and is called the splitting field
of $f$ over $F$. We carry out this construction in this chapter and use it, among other
things, to completely classify all finite fields.


6.1 VECTOR SPACES 


Order and simplification are the first steps to the mastery of a subject. 


—Thomas Mann 
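
Consider the following system of linear equations:

$$
\begin{aligned}
ax + by &= 0 \\
cx + dy &= 0
\end{aligned}
\qquad a, b, c, d \text{ real.}
$$

Because the system is homogeneous (both constants on the right are zero), the set
of solutions to the system has an algebraic structure. More precisely, if both

$$
\begin{bmatrix} x \\ y \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} x_1 \\ y_1 \end{bmatrix}
$$

are solutions, then for any real number $k$, the sum and scalar product

$$
\begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} x_1 \\ y_1 \end{bmatrix} = \begin{bmatrix} x + x_1 \\ y + y_1 \end{bmatrix}
\quad \text{and} \quad
k \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} kx \\ ky \end{bmatrix}
$$
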
are also solutions. In fact, the set of all solutions is an additive abelian group, and 
any such group with an appropriate scalar multiplication defined on it is an example 
of a real vector space. These vector spaces are the chief objects of study in linear 
algebra—the sister subject of abstract algebra. 

In linear algebra, matrices and vector spaces over the real numbers are defined, 
and concepts such as basis and dimension are introduced. Most of this theory can 
be developed in the same way with an arbitrary field F replacing R throughout, 
and some of this is needed in this chapter. However, a course in linear algebra is not 
a prerequisite to the present discussion, and this section develops just enough 
of the theory for the applications to fields that follow. If the reader is familiar with 
real vector spaces, a glance at this section will probably suffice before proceeding 
to Section 6.2.


Vector Spaces 


If $F$ is any field, a vector space $V$ over $F$ is an additive abelian group such that
for all $a \in F$ and $v$ in $V$, an element $av$ in $V$ is defined (called the scalar multiple
of $v$ by $a$) that satisfies the following conditions for all $a, b \in F$ and all $v, w \in V$:

V1 $a(v + w) = av + aw$.
V2 $(a + b)v = av + bv$.
V3 $a(bv) = (ab)v$.
V4 $1v = v$.

The elements of $V$ and $F$ are called vectors and scalars, respectively. To emphasize
the field of scalars, we call $V$ an $F$-space and denote it $V = {}_F V$.

Of course, we adopt all the conventions about an additive abelian group V: The 
unity is called the zero of V, denoted 0, and the inverse of a vector v is denoted 
—v and is called the negative of v. 


Example 1. If $F$ is a field, $F^n = \{(a_1, \dots, a_n) \mid a_i \in F\}$ is a vector space with
the usual componentwise addition and scalar multiplication:

$$
\begin{aligned}
(a_1, \dots, a_n) + (b_1, \dots, b_n) &= (a_1 + b_1, \dots, a_n + b_n), \\
k(a_1, \dots, a_n) &= (ka_1, \dots, ka_n).
\end{aligned}
$$

If $n = 1$, $F^1 = {}_F F$ is a vector space over itself. When it is more convenient, we
write the $n$-tuples in $F^n$ as columns rather than rows. □
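
The componentwise operations of Example 1 can be tried out on a small finite field. The following is a minimal sketch, assuming the field is the integers modulo a prime `P` (here 5); the names `P`, `vec_add`, and `scalar_mul` are illustrative choices and do not come from the text.

```python
# Sketch of F^n with F = Z_P, the integers modulo P.
# Assumption: P is prime, so that Z_P really is a field.
P = 5

def vec_add(u, v):
    """Componentwise sum of two n-tuples over Z_P."""
    return tuple((a + b) % P for a, b in zip(u, v))

def scalar_mul(k, v):
    """Scalar multiple k*v of an n-tuple over Z_P."""
    return tuple((k * a) % P for a in v)

u = (1, 2, 3)
v = (4, 4, 0)
print(vec_add(u, v))     # (0, 1, 3)
print(scalar_mul(3, u))  # (3, 1, 4)
```

Looping over all scalars in `range(P)` gives a quick finite check of an axiom such as V2 for a particular vector, although of course it is not a proof for a general field.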


Example 2. If $F$ is a field, the set $M_n(F)$ of all $n \times n$ matrices over $F$ is a vector
space with the usual matrix addition, and scalar multiplication $a[a_{ij}] = [a a_{ij}]$ for all
$a \in F$ and $[a_{ij}] \in M_n(F)$.


Example 3. Let $R$ be any ring that contains a field $F$ as a subring. Then $R = {}_F R$
is a vector space using the addition and multiplication of $R$. Thus, $\mathbb{C}$ is an $\mathbb{R}$-space,
and we refer repeatedly to the case where $R$ itself is a field. Also, $F[x]$ is an $F$-space
where $F$ is identified with the subring of constant polynomials.


Example 4. If $F$ is any field, the additive group $\{0\}$ is a vector space over $F$ if we
define $a \cdot 0 = 0$ for all $a \in F$. It is called the zero space and denoted $0$.


Theorem 1 collects several frequently used facts about vector spaces V over a 
field F. When no confusion can result (which is nearly always), we use the symbol 
0 for both the zero of the field F and that of the additive abelian group V. 


Theorem 1. Let $V$ be an $F$-space where $F$ is a field and let $a \in F$ and $v \in V$.

(1) $0v = 0$ and $a0 = 0$.

(2) $av = 0$ if and only if $a = 0$ in $F$ or $v = 0$ in $V$.

(3) $(-1)v = -v$.

(4) $(-a)v = -(av) = a(-v)$.

Proof. $0v = (0 + 0)v = 0v + 0v$ by axiom V2, so $0v = 0$. Similarly, V1 gives $a0 = 0$,
proving (1). If $av = 0$ and $a \neq 0$, then $v = 1v = (a^{-1}a)v = a^{-1}(av) = a^{-1}0 = 0$ by
(1). With (1), this proves (2). As to (3), $(-1)v + v = (-1 + 1)v = 0v = 0$ by (1), so
$(-1)v$ is the additive inverse of $v$. This gives (3); (4) is left as Exercise 5. ∎

A subset $U$ of a vector space ${}_F V$ is called a subspace of $V$ if $U$ is itself a
vector space using the addition and scalar multiplication of $V$; in other words, $U$ is a
subgroup of $V$ that is closed under scalar multiplication ($au \in U$ for all $a \in F$ and
$u \in U$). Theorem 2 is the analogue of the subgroup test. (The proof is Exercise 6.)


Theorem 2. Subspace Test. A nonempty subset $U$ of a vector space ${}_F V$ is a
subspace if and only if it is closed under addition and scalar multiplication.




Example 5. If ${}_F V$ is a vector space, $V$ and $\{0\}$ are subspaces of $V$.


Example 6. If ${}_F V$ is a vector space and $v \in V$, write $Fv = \{av \mid a \in F\}$. This is
easily verified to be a subspace of $V$ (using Theorem 1).


Example 7. If $A$ is any matrix in $M_n(F)$, show that $U = \{u \in F^n \mid Au = 0\}$ is a
subspace of $F^n$. (Here, vectors in $F^n$ are written as columns.)


Solution. $U$ is nonempty because $A0 = 0$. If $u, v \in U$, then $A(u + v) = Au + Av = 0 + 0 = 0$, so $u + v \in U$ and
$A(ku) = k(Au) = k0 = 0$, $k \in F$, so $ku \in U$. Hence, the subspace test applies. □


Spanning and Independence 


The most important way to describe subspaces of a vector space ${}_F V$ is to use the
following notion: If $v_1, \dots, v_n$ are vectors in $V$, a vector of the form

$$a_1 v_1 + \cdots + a_n v_n \quad \text{where } a_i \in F \text{ for all } i$$

is called a linear combination of the $v_i$. The set of all such vectors is denoted

$$\operatorname{span}\{v_1, \dots, v_n\} = \{a_1 v_1 + \cdots + a_n v_n \mid a_i \in F\}.$$

We can easily verify that this is a subspace of $V$ that contains each of the vectors
$v_1, v_2, \dots, v_n$. Moreover, $\operatorname{span}\{v_1, \dots, v_n\}$ is the smallest subspace of $V$ containing
each $v_i$ in the sense that if $U$ is any such subspace, then $\operatorname{span}\{v_1, \dots, v_n\} \subseteq U$.

We use the following terminology. Let $v_1, \dots, v_n$ be vectors in a vector space
${}_F V$. Then $\operatorname{span}\{v_1, \dots, v_n\}$ is called the subspace of $V$ spanned (or generated)
by these vectors. We say that $V$ is finite dimensional if

$$V = \operatorname{span}\{v_1, \dots, v_n\} \quad \text{for finitely many vectors } v_1, \dots, v_n.$$

In this case, we say that the vectors $v_1, \dots, v_n$ are a spanning set for $V$.

Example 8. If $F$ is a field, show that the space $F[x]$ is not finite dimensional.


Solution. The degree of any nonzero polynomial in $\operatorname{span}\{f_1, \dots, f_n\}$ cannot exceed
the maximum of the degrees of the (nonzero) $f_i$. Since $F[x]$ contains polynomials
of arbitrarily large degree, it is impossible that $F[x] = \operatorname{span}\{f_1, \dots, f_n\}$. □


If $V = \operatorname{span}\{v_1, \dots, v_n\}$, then every vector $v$ in $V$ can be written in at least
one way as a linear combination of the vectors $v_1, v_2, \dots, v_n$. The spanning sets
for which this happens in exactly one way for every $v$ in $V$ are of fundamental
importance. In particular, the trivial linear combination

$$0v_1 + 0v_2 + \cdots + 0v_n = 0$$

is certainly one way to express the zero vector as a linear combination of the $v_i$,
and it turns out to be enough to insist that this is the only way to do it.

With this in mind, a set $\{v_1, v_2, \dots, v_n\}$ of vectors in a vector space ${}_F V$ is called
linearly independent (or simply independent) if

$$a_1 v_1 + a_2 v_2 + \cdots + a_n v_n = 0,\ a_i \in F \quad \text{implies that} \quad a_1 = a_2 = \cdots = a_n = 0.$$

A set of vectors that is not independent is called dependent.




Example 9. If $2 \neq 0$ in the field $F$, show that $\{(1, 1), (1, -1)\}$ is independent in
$F^2$, whereas $\{(1, 2), (1, -1), (0, 1)\}$ is dependent.


Solution. If $a(1, 1) + b(1, -1) = (0, 0)$, equating first and second components gives
$a + b = 0$ and $a - b = 0$. Adding these gives $2a = 0$, so, as $2 \neq 0$ and $F$ is a field, the only solution is $a = b = 0$, and
the linear combination is the trivial one. However, $-(1, 2) + (1, -1) + 3(0, 1) = (0, 0)$
shows that $\{(1, 2), (1, -1), (0, 1)\}$ is dependent. □
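
The role of the hypothesis that 2 is nonzero in Example 9 can be seen by brute force over small prime fields. The sketch below assumes the field is the integers modulo a prime `p` and simply tries every nontrivial choice of coefficients; the function name `is_independent` is a hypothetical helper, not notation from the text.

```python
from itertools import product

def is_independent(vectors, p):
    """Brute-force test of linear independence over Z_p (p assumed prime).

    Tries every nonzero coefficient tuple and reports False as soon as
    some nontrivial linear combination equals the zero vector.
    """
    n = len(vectors[0])
    for coeffs in product(range(p), repeat=len(vectors)):
        if any(coeffs):
            combo = tuple(
                sum(c * v[i] for c, v in zip(coeffs, vectors)) % p
                for i in range(n)
            )
            if not any(combo):
                return False
    return True

print(is_independent([(1, 1), (1, -1)], 3))           # True: 2 != 0 in Z_3
print(is_independent([(1, 1), (1, -1)], 2))           # False: 2 = 0 in Z_2
print(is_independent([(1, 2), (1, -1), (0, 1)], 5))   # False
```

In the field with two elements, the vectors (1, 1) and (1, -1) coincide, which is why the first set fails to be independent there.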


The zero vector cannot belong to any independent set (by Theorem 1). On the 
other hand, Theorem 1 gives 


Example 10. Given $v \in V$, $\{v\}$ is independent if and only if $v \neq 0$.


A set of vectors in a vector space V is called a basis for V if it is linearly 
independent and also spans V. 


Example 11. In the vector space $F^n$, consider the vectors

$$e_1 = (1, 0, \dots, 0),\quad e_2 = (0, 1, \dots, 0),\quad \dots,\quad e_n = (0, 0, \dots, 1)$$

in $F^n$. We have $(a_1, a_2, \dots, a_n) = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n$ for all $a_i \in F$. It follows
that $\{e_1, e_2, \dots, e_n\}$ is a basis of $F^n$, which is called the standard basis.


Theorem 3. If $\{v_1, \dots, v_n\}$ is a basis of ${}_F V$, then every vector $v$ in $V$ has a unique
representation as a linear combination $v = a_1 v_1 + \cdots + a_n v_n$, $a_i \in F$.


Proof. Such a representation exists because $V = \operatorname{span}\{v_1, \dots, v_n\}$. If we have two
expressions $v = a_1 v_1 + \cdots + a_n v_n$ and $v = b_1 v_1 + \cdots + b_n v_n$ for $v$, then

$$0 = v - v = (a_1 - b_1)v_1 + \cdots + (a_n - b_n)v_n.$$

Hence, the independence of $\{v_1, \dots, v_n\}$ guarantees that $a_i = b_i$ for each $i$. ∎


If $F$ is a finite field with $|F| = q$, Theorem 3 shows that a vector space ${}_F V$
with a basis $\{v_1, \dots, v_n\}$ of $n$ vectors has exactly $q^n$ elements. In fact, in forming a
typical vector $v = a_1 v_1 + \cdots + a_n v_n$ in $V$, there are $q$ choices for each coefficient $a_i$,
and Theorem 3 guarantees that each series of choices produces a different vector in
$V$. We make use of this fact in Section 6.4 on finite fields.

Note that if $V$ has another basis $\{w_1, \dots, w_m\}$, then, similarly, $|V| = q^m$. Hence,
$q^n = |V| = q^m$ and so $n = m$. In other words, the number of elements in any two
bases of $V$ is the same. In fact, this remains true even for arbitrary fields and leads
to the fundamental concept of dimension. We now turn to a proof of this basic fact.
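
The counting argument above can be checked directly on a small case. The sketch below assumes the field is the integers modulo 3 and uses two independent vectors in the space of triples chosen purely for illustration; the names `Q`, `basis`, and `combo` are not from the text.

```python
from itertools import product

Q = 3                              # assumed field size: Z_3
basis = [(1, 0, 1), (0, 1, 2)]     # illustrative independent vectors in Z_3^3

def combo(coeffs, vectors):
    """Linear combination of `vectors` with the given coefficients, mod Q."""
    n = len(vectors[0])
    return tuple(
        sum(c * v[i] for c, v in zip(coeffs, vectors)) % Q for i in range(n)
    )

# One vector for every choice of coefficients; a set discards repeats.
span = {combo(c, basis) for c in product(range(Q), repeat=len(basis))}
print(len(span))  # 9
```

Because the two vectors are independent, no two coefficient choices collide, so the span has exactly 3 squared elements.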


Dimension 


The concept of basis is fundamental to the theory of vector spaces, and we develop 
the most important properties of bases in this section. The key result is 


Theorem 4. Fundamental Theorem. Suppose that $V = \operatorname{span}\{v_1, \dots, v_n\}$ is a
vector space and that $\{u_1, \dots, u_m\}$ is an independent subset of $V$. Then $m \leq n$.


Proof. We assume that $m > n$ and show that this leads to a contradiction. Because
$V = \operatorname{span}\{v_1, \dots, v_n\}$, write $u_1 = a_1 v_1 + \cdots + a_n v_n$. As $u_1 \neq 0$, not all the $a_i$ are zero,
say $a_1 \neq 0$ (after relabeling the $v_i$). Then $V = \operatorname{span}\{u_1, v_2, \dots, v_n\}$ by Exercise 21.
Now write $u_2 = b_1 u_1 + a_2 v_2 + \cdots + a_n v_n$. Then some $a_i$ is nonzero because $\{u_1, u_2\}$ is
independent (by Exercise 22), so $V = \operatorname{span}\{u_1, u_2, v_3, \dots, v_n\}$ as before. As $m > n$,
this procedure continues until all the vectors $v_1, \dots, v_n$ are replaced by $u_1, \dots, u_n$.
In particular, $V = \operatorname{span}\{u_1, \dots, u_n\}$. But then $u_{n+1}$ is a linear combination of
$u_1, \dots, u_n$, a contradiction because $\{u_1, \dots, u_m\}$ is independent. ∎


If $V = \operatorname{span}\{v_1, \dots, v_n\}$, and if $\{u_1, \dots, u_m\}$ is independent in $V$, the proof of
Theorem 4 shows that not only $m \leq n$ but also $m$ of the (spanning) vectors
$v_1, \dots, v_n$ can be replaced by the (independent) vectors $u_1, \dots, u_m$ and the resulting
set will still span $V$. This result is called the Steinitz exchange lemma.

The first consequence of Theorem 4 is that the number of vectors in a basis of
a vector space $V$ is an invariant of $V$; that is, it is the same for any basis.


Theorem 5. Invariance Theorem. If $\{u_1, \dots, u_m\}$ and $\{v_1, \dots, v_n\}$ are two
bases of a vector space $V$, then $m = n$.


Proof. We have $m \leq n$ by Theorem 4 because $\{u_1, \dots, u_m\}$ is independent and
$V = \operatorname{span}\{v_1, \dots, v_n\}$. Interchanging the $u_i$ and $v_i$ gives $n \leq m$, so $m = n$. ∎


Hence, if a vector space $V \neq 0$ has a basis $\{v_1, \dots, v_n\}$, the integer $n$ does not
depend on the choice of basis, and $n$ is called the dimension of $V$ and is denoted
$n = \dim V$.


The dimension of the zero space is defined to be 0. This is equivalent to regarding 
the zero space as having an empty basis and is consistent with the fact that the zero 
vector cannot belong to any independent set. Hence, the statement that dimV =n 
if and only if V has a basis of n vectors holds even if n = 0. 


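Example 12. $\dim {}_{\mathbb{R}}\mathbb{C} = 2$ because $\{1, i\}$ is a basis.

Example 13. $\left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$ is a basis of $M_2(F)$, so
$\dim {}_F M_2(F) = 4$. Similarly, $\dim {}_F M_n(F) = n^2$.

Example 14. If $n \geq 1$, then $\dim {}_F F^n = n$ by Example 11.

Example 15. Consider the subspace $V = \operatorname{span}\{1, x, x^2, \dots, x^n\}$ of $F[x]$. Then
$\dim V = n + 1$ because $\{1, x, x^2, \dots, x^n\}$ is independent by the definition of the
indeterminate $x$ in Section 4.1. Hence $\dim(F[x])$ is not finite by Theorem 4.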


The second consequence of the fundamental theorem is that any finite dimen- 
sional vector space has a basis. We need Lemmas 1 and 2 (Exercises 24 and 25). 


Lemma 1. Let $\{v_1, \dots, v_n\}$ be an independent set in a vector space $V$. If $v \in V$,
then $\{v, v_1, \dots, v_n\}$ is independent if and only if $v \notin \operatorname{span}\{v_1, \dots, v_n\}$.


Lemma 2. A set of vectors is dependent if and only if one of them is in the span
of the rest.


Theorem 6. Let $V \neq 0$ be a finite dimensional vector space, say $V$ is spanned by
$n$ vectors.

(1) $V$ has a finite basis and $\dim V \leq n$.

(2) Each independent subset of $V$ is part of a basis.

(3) Each finite spanning set for $V$ contains a basis.




Proof. (1) Because $V$ has a finite spanning set by hypothesis, it has a finite basis
by (3), which is proved below. Because the basis is independent, $\dim V \leq n$ by
Theorem 4.

(2) Let $\{v_1, \dots, v_k\}$ be independent in $V$, a finite set by the fundamental
theorem. If $\operatorname{span}\{v_1, \dots, v_k\} = V$, the proof is complete. Otherwise, there exists
$v_{k+1} \in V$, with $v_{k+1} \notin \operatorname{span}\{v_1, \dots, v_k\}$. Then $\{v_1, \dots, v_k, v_{k+1}\}$ is independent
by Lemma 1, which completes the proof if $V = \operatorname{span}\{v_1, \dots, v_k, v_{k+1}\}$. If not,
repeat the process. Thus, either the proof is complete at some stage or the process
constructs arbitrarily large independent subsets of $V$. But this is impossible by the
fundamental theorem, because $V$ is spanned by $n$ vectors.

(3) Let $V = \operatorname{span}\{v_1, \dots, v_n\}$. If $\{v_1, \dots, v_n\}$ is independent, there is nothing
to prove. Otherwise, one of these vectors lies in the span of the rest by Lemma 2;
relabeling if necessary, let $v_1 \in \operatorname{span}\{v_2, \dots, v_n\}$. Then $V = \operatorname{span}\{v_2, \dots, v_n\}$, so the
proof is complete if $\{v_2, \dots, v_n\}$ is independent. If not, repeat the process. If a basis
is encountered at some stage, the proof is complete. Otherwise, we ultimately reach
$V = \operatorname{span}\{v_n\}$. But then $\{v_n\}$ is a basis because $v_n \neq 0$ ($V \neq 0$ by hypothesis). ∎
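
Part (3) of Theorem 6 is constructive in spirit, and a closely related procedure is easy to sketch in code. The version below does not discard vectors one at a time as in the proof; instead it sweeps through the spanning set and keeps a vector exactly when it is not in the span of those already kept (the criterion of Lemma 1). It assumes the field is the rationals and tests span membership by computing ranks with Gaussian elimination; `rank` and `extract_basis` are hypothetical helper names.

```python
from fractions import Fraction

def rank(vectors):
    """Rank over Q of a nonempty list of equal-length vectors (Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def extract_basis(spanning):
    """Select a basis of span(spanning) from among the given vectors."""
    basis = []
    for v in spanning:
        # v is outside span(basis) exactly when adding it raises the rank.
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

S = [(1, 2, 0), (2, 4, 0), (0, 1, 1), (1, 3, 1), (0, 0, 1)]
print(extract_basis(S))  # [(1, 2, 0), (0, 1, 1), (0, 0, 1)]
```

Here the second vector is twice the first and the fourth is the sum of the first and third, so both are dropped, leaving three independent vectors that still span.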


Parts (2) and (3) of Theorem 6 reveal a useful property of a vector space V: 
If dimV =n, a set B of exactly n vectors in V is independent if and only if it 
spans V (Theorem 7). The advantage of this is that it eliminates the need to verify 
one or the other of these properties when we are checking that B is a basis of V. 


Theorem 7. Let $V$ be a vector space with $\dim V = n$ and let $B \subseteq V$ be a set of
exactly $n$ vectors. If $B$ is independent or spans $V$, then $B$ is a basis of $V$.


Proof. If $B$ is independent and does not span $V$, then $B$ is part of a basis of more than
$n$ vectors by Theorem 6, which contradicts Theorem 5. Similarly, if $B$ spans $V$ and
is not independent, then $B$ contains a basis of fewer than $n$ vectors by Theorem 6,
again contrary to Theorem 5. ∎


Example 16. Let $a$ be an element of a field $F$ and let $n \geq 0$. Given $f$ in $F[x]$,
with $\deg f \leq n$, show that $a_0, a_1, \dots, a_n$ exist in $F$ such that

$$f = a_0 + a_1(x - a) + \cdots + a_n(x - a)^n.$$


Solution. Let $V = \operatorname{span}\{1, x, \dots, x^n\}$ in $F[x]$. If $B = \{1, (x - a), \dots, (x - a)^n\}$,
then $B \subseteq V$ and we show that $B$ spans $V$. As $\dim V = n + 1$, it suffices by Theorem 7
to show that $B$ is independent. Suppose that $r_0 + r_1(x - a) + \cdots + r_n(x - a)^n = 0$
in $F[x]$, $r_i \in F$. Then $r_n$ is the coefficient of $x^n$ on the left-hand side, so $r_n = 0$. Next
$r_{n-1} = 0$ in the same way, and we continue to get $r_i = 0$ for all $i$. □
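
Example 16 shows that the coefficients exist; one standard way to compute them (not the argument used in the text) is repeated division by $x - a$, where each remainder is the next coefficient. The sketch below assumes rational coefficients and stores a polynomial as a list of coefficients from the constant term upward; `shift_coefficients` is a hypothetical name.

```python
from fractions import Fraction

def shift_coefficients(coeffs, a):
    """Expand f in powers of (x - a).

    coeffs = [c0, c1, ..., cn] represents f = c0 + c1*x + ... + cn*x^n.
    Returns [a0, ..., an] with f = a0 + a1*(x-a) + ... + an*(x-a)^n.
    """
    work = [Fraction(c) for c in coeffs]
    result = []
    while work:
        # Synthetic (Horner) division of `work` by (x - a).
        partial = []
        carry = Fraction(0)
        for c in reversed(work):        # highest-degree coefficient first
            carry = carry * a + c
            partial.append(carry)
        remainder = partial.pop()       # value of the current polynomial at a
        result.append(remainder)
        work = list(reversed(partial))  # quotient, constant term first
    return result

# x^2 = 1 + 2(x - 1) + (x - 1)^2
print(shift_coefficients([0, 0, 1], 1) == [1, 2, 1])  # True
```

Each pass lowers the degree by one, so a polynomial of degree at most $n$ yields at most $n + 1$ coefficients, matching the size of the basis $B$ in the solution.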


We conclude this section with a theorem relating the dimension of a vector space 
V to the dimensions of its subspaces. 


Theorem 8. Let $V \neq 0$ be a vector space, $\dim V = n$, and let $U \subseteq V$ be a subspace.
(1) $U$ has a basis and $\dim U \leq n$.
(2) If $\dim U = n$, then $U = V$.
(3) Every basis for $U$ is part of a basis for $V$.


Proof. (1) If $U = 0$, then it has an empty basis and $\dim U = 0$. If $U \neq 0$, let
$u_1 \neq 0$ in $U$. If $\operatorname{span}\{u_1\} = U$ then $\{u_1\}$ is a basis of $U$. Otherwise, the construction
in the proof of Theorem 6(2) either produces a basis for $U$ or creates arbitrarily
large independent subsets of $V$. The latter outcome cannot happen by Theorem
4 because $V$ is spanned by $n$ vectors. Hence, $U$ has a basis $\{u_1, \dots, u_m\}$. Then
$\dim U = m$, and $m \leq n$ again by Theorem 4.

(2) If $\dim U = n$, any basis $\{u_1, \dots, u_n\}$ of $U$ is a basis of $V$ by Theorem 7.
Thus, $U = \operatorname{span}\{u_1, \dots, u_n\} = V$.

(3) This follows from Theorem 6. ∎


Exercises 6.1


Throughout these exercises, $F$ denotes a field.

1. Which of the following are subspaces of $F^3$? Support your answer.
(a) $U = \{(a, b, 1) \mid a, b \text{ in } F\}$
(b) $U = \{(a, b, c) \mid a - 2b + 3c = 0\}$
(c) $U = \{(a, b - 1, c) \mid a, b, c \text{ in } F\}$
(d) $U = \{(2a + b, b - c, 3b + a) \mid a, b, c \text{ in } F\}$

2. Which of the following are subspaces of $F[x]$? Support your answer.
(a) $U = \{f \mid f(2) = 0\}$
(b) $U = \{xf \mid f \in F[x]\}$
(c) $U = \{f \mid \deg f < 3\}$
(d) $U = \{f \mid f(3) = 1\}$

3. Show that $F^3 = \operatorname{span}\{(1, 1, 0), (1, 0, 1), (0, 1, 1)\}$ provided that $2 \neq 0$ in $F$.

4. (a) Show that $\operatorname{span}\{u, v, w\} = \operatorname{span}\{u + v, u + w, v + w\}$ in any vector space ${}_F V$
where $2 \neq 0$ in $F$.
(b) Is (a) true if $F = \mathbb{Z}_2$? Support your answer.

5. Prove (4) of Theorem 1.

6. Prove the subspace test (Theorem 2).

7. Which of the following are independent in $V$? Support your answer.


(a) {(1, 2,3), (4,0,1), (2,1,0)} in V= Zz 

(b) $\{(1, 0, 1, 0), (1, 1, 0, 0), (0, 1, 0, 1), (0, 0, 1, 1)\}$ in $V = F^4$
(c) $\{x^2 + 1, x + 1, x\}$ in $V = F[x]$
(d) $\{x^2 - x + 1, 2x^2 + x + 1, x - 1\}$ in $V = F[x]$

8. Given $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ in $M_2(F)$, show that $A$ is invertible if and only if $\{(a, b), (c, d)\}$
is a basis of $F^2$.

9. (a) Show that $\{1, \sqrt{2}, \sqrt{3}\}$ is independent in $\mathbb{R}$ over $\mathbb{Q}$.
(b) Show that $\{1, \sqrt{2}, \sqrt{3}, \sqrt{6}\}$ is independent in $\mathbb{R}$ over $\mathbb{Q}$.
[Hint: $(c + d\sqrt{2})(c - d\sqrt{2}) = c^2 - 2d^2$.]

10. Find a basis of $\mathbb{R}^2$ containing $v = (1, -2)$ and a basis not containing $v$.

11. Find infinitely many bases of $\mathbb{R}^3$ containing $v = (1, -1, 0)$ and $w = (1, 1, 1)$.

12. Find all values of $r$ for which $\{(2, r, 1), (1, 0, 2), (0, 1, -2)\}$ is independent in $\mathbb{R}^3$.

13. Suppose that $f$ and $g$ in $F[x]$ satisfy $f(a) = 0 = g(b)$, $f(b) \neq 0$, and $g(a) \neq 0$ for
some fixed $a$ and $b$ in $F$. Show that $\{f, g\}$ is independent in $F[x]$.

14. Show that $\{f_1, f_2, \dots, f_n\}$ is independent in $F[x]$ whenever $\deg f_1, \deg f_2, \dots, \deg f_n$
are distinct. [See solution to Example 16.]

15. If $A$ is a $2 \times 2$ matrix in $M_2(F)$, show that $a_0 I + a_1 A + a_2 A^2 + a_3 A^3 + a_4 A^4 = 0$ for
some $a_i \in F$, not all zero.


16. If $\{A_1, A_2, \dots, A_k\}$ is linearly independent in $M_n(F)$, and if $U$ is invertible, show
that $\operatorname{span}\{A_1 U, A_2 U, \dots, A_k U\}$ has dimension $k$.

17. Let $\{A_1, \dots, A_k\}$ in $M_n(F)$ be such that, for some column $v \neq 0$ in $F^n$, $A_i v = 0$
for each $i$. Show that $\operatorname{span}\{A_1, \dots, A_k\} \neq M_n(F)$.

18. If $X = \{1, 2, \dots, n\}$, let $D_n = \{f \mid f : X \to F \text{ is a mapping}\}$. If $f, g \in D_n$ and $a \in F$,
define the pointwise sum $f + g$ and scalar product $af$ by $(f + g)(x) = f(x) + g(x)$
and $(af)(x) = a f(x)$ for all $x \in X$. Show that $D_n$ is a vector space over $F$ and that
$\dim D_n = n$.

19. Let $R$ be a ring such that the field $F \subseteq R$ is a subring. Assume that $\dim({}_F R) = n$.
(a) If $r \in R$, show that $p(r) = 0$ for some polynomial $p \neq 0$ in $F[x]$, with $\deg p \leq n$.
[Hint: Can $\{1, r, r^2, \dots, r^n\}$ be independent?]
(b) If $R$ is an integral domain, show that it must be a field.

20. Let $F \subseteq E$ be fields with $\dim {}_F E = n$. If ${}_E V$ is a vector space over $E$ (and hence
over $F$), and if $\dim {}_E V = m$, show that $\dim {}_F V = mn$.

21. If $u = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$ in a vector space $V$, and if $a_1 \neq 0$, show that
$\operatorname{span}\{v_1, v_2, \dots, v_n\} = \operatorname{span}\{u, v_2, \dots, v_n\}$.

22. (a) Show that every subset of an independent set of vectors is independent.
(b) Show that a set of vectors is dependent if it contains a dependent subset.

23. (a) Show that an independent set $\{u_1, \dots, u_n\}$ in ${}_F V$ with $n$ maximal is a basis.
(b) Show that a spanning set $\{v_1, \dots, v_n\}$ of ${}_F V$ with $n$ minimal is a basis.

24. Prove Lemma 1.

25. Prove Lemma 2.

26. Let $U$ and $W$ be subspaces of a finite dimensional vector space $V$.
(a) Show that $U + W = \{u + w \mid u \in U, w \in W\}$ is a subspace of $V$.
(b) If $U \cap W = 0$, show that $\dim(U + W) = \dim U + \dim W$.
(c) In general, show that $\dim(U + W) = \dim U + \dim W - \dim(U \cap W)$.
[Hint: By Theorem 6(2), extend a basis of $U \cap W$ to bases of $U$ and of $W$.]

27. If $U$ and $W$ are finite dimensional subspaces of $V$, show that $U + W$ (defined in
Exercise 26) is finite dimensional.

28. A polynomial $p$ in $F[x]$ is called even if $p(-x) = p(x)$, and $p$ is odd if $p(-x) = -p(x)$.
Let $V = \operatorname{span}\{1, x, x^2, \dots, x^n\}$ and let $U$ and $W$ denote, respectively, the sets of even
and odd polynomials in $V$. Assume that $2 \neq 0$ in $F$.
(a) Show that $U$ and $W$ are subspaces of $V$ such that $U \cap W = 0$ and $U + W = V$
(see Exercise 26).
(b) Find $\dim U$ and $\dim W$.

29. Let $U$ be a subspace of a vector space $V$ with $\dim U = m$ and let $v \in V$. Given
$W = \{u + av \mid u \in U, a \in F\}$, show that $W$ is a subspace of $V$ and that $\dim W = m$
or $\dim W = m + 1$.

30. If $U$ is a subspace of a vector space ${}_F V$, define a scalar multiplication on the (additive)
factor group $V/U$ by $a(v + U) = av + U$. Show that $V/U$ is a vector space and that if $V$
is finite dimensional, then $V/U$ is finite dimensional and $\dim V/U = \dim V - \dim U$.

31. A linear transformation $\varphi : {}_F V \to {}_F W$ is a map such that $\varphi(v + w) = \varphi(v) + \varphi(w)$
and $\varphi(av) = a\varphi(v)$ for all $a \in F$ and all $v, w \in V$.
(a) Show that $\ker \varphi$ and $\operatorname{im} \varphi$ are subspaces of $V$ and $W$, respectively.
(b) If $V$ is finite dimensional, show that $\operatorname{im} \varphi$ is also finite dimensional.
(c) If $V$ is finite dimensional, show that $\dim V = \dim(\ker \varphi) + \dim(\operatorname{im} \varphi)$. (This is
called the dimension theorem.) [Hint: Extend a basis $\{u_1, \dots, u_m\}$ of $\ker \varphi$ to
a basis $\{u_1, \dots, u_m, v_1, \dots, v_k\}$ of $V$.]




32. Vector spaces ${}_F V$ and ${}_F W$ are called isomorphic (written $V \cong W$) if a one-to-one,
onto linear transformation $V \to W$ exists (see Exercise 31). If ${}_F V$ has dimension $n$,
show that $V \cong F^n$.


6.2 ALGEBRAIC EXTENSIONS 


Much of field theory concerns the relationship between two fields $F$ and $E$, with
$E \supseteq F$. Of course, this is taken to mean that $F$ is a subring of $E$, and in this case,
$F$ is called a subfield of $E$ and $E$ is called an extension field of $F$ (or simply
an extension of $F$). Everything we do relies on the fact that $E = {}_F E$ is a vector
space over $F$ using the operations of $E$. If the vector space ${}_F E$ has finite dimension,
then $E$ is called a finite extension of $F$, and we write $\dim {}_F E = [E : F]$.


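Example 1. $\mathbb{C} \supseteq \mathbb{R}$ is finite, and $[\mathbb{C} : \mathbb{R}] = 2$ because $\{1, i\}$ is an $\mathbb{R}$-basis of $\mathbb{C}$.

Example 2. We demonstrate later that $\mathbb{R} \supseteq \mathbb{Q}$ is not a finite extension.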


Theorem 1. Let $E \supseteq F$ be a finite extension with $[E : F] = n$. If $u \in E$, a
polynomial $f \neq 0$ in $F[x]$ exists such that $\deg f \leq n$ and $f(u) = 0$.


Proof. The $n + 1$ elements $1, u, u^2, \dots, u^n$ of $E$ cannot be $F$-independent because
$\dim {}_F E = n$ (Theorem 4 §6.1). Hence, $a_0 + a_1 u + a_2 u^2 + \cdots + a_n u^n = 0$ for some
$a_i \in F$, not all zero. Take $f = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$. ∎


If $E \supseteq F$ is an extension of fields, an element $u \in E$ is said to be algebraic
over $F$ if $f(u) = 0$ for some polynomial $f \neq 0$ in $F[x]$ (which may be taken to be
monic since $F$ is a field). An extension $E \supseteq F$ is called an algebraic extension
if every element of $E$ is algebraic over $F$. Thus, Theorem 1 asserts that every finite
extension is algebraic. We show later (Example 16) that the converse is not true.


Example 3. The complex numbers $\sqrt[3]{2}$ and $i$ are algebraic over $\mathbb{Q}$ because they
are roots of $x^3 - 2$ and $x^2 + 1$, respectively.


Example 4. Each element $a$ of $F$ is algebraic over $F$, being a root of $x - a$.


Example 5. The number $u = \sqrt{2} - \sqrt{3}$ is algebraic over $\mathbb{Q}$. Indeed $u^2 = 5 - 2\sqrt{6}$,
so $(u^2 - 5)^2 = 24$. This gives $u^4 - 10u^2 + 1 = 0$, so $u$ is a root of $x^4 - 10x^2 + 1$.
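
The computation in Example 5 can be confirmed with exact arithmetic. The sketch below represents a number $c_0 + c_1\sqrt{2} + c_2\sqrt{3} + c_3\sqrt{6}$ by the integer tuple `(c0, c1, c2, c3)`; this representation and the helper `mul` are assumptions made for illustration, and the multiplication rule uses only $\sqrt{2}\sqrt{3} = \sqrt{6}$, $\sqrt{2}\sqrt{6} = 2\sqrt{3}$, and $\sqrt{3}\sqrt{6} = 3\sqrt{2}$.

```python
def mul(x, y):
    """Product of two numbers written in terms of 1, sqrt2, sqrt3, sqrt6."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (
        a0 * b0 + 2 * a1 * b1 + 3 * a2 * b2 + 6 * a3 * b3,  # rational part
        a0 * b1 + a1 * b0 + 3 * (a2 * b3 + a3 * b2),        # sqrt2 part
        a0 * b2 + a2 * b0 + 2 * (a1 * b3 + a3 * b1),        # sqrt3 part
        a0 * b3 + a3 * b0 + a1 * b2 + a2 * b1,              # sqrt6 part
    )

u = (0, 1, -1, 0)          # u = sqrt2 - sqrt3
u2 = mul(u, u)
u4 = mul(u2, u2)
one = (1, 0, 0, 0)

print(u2)  # (5, 0, 0, -2), that is, 5 - 2*sqrt6
# u^4 - 10*u^2 + 1, computed coordinate by coordinate:
print(tuple(p - 10 * q + r for p, q, r in zip(u4, u2, one)))  # (0, 0, 0, 0)
```

This also illustrates the idea behind Theorem 1: the powers of $u$ all live in a space of finite dimension over $\mathbb{Q}$, so enough of them must satisfy a nontrivial linear relation.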


If $E \supseteq F$ are fields, an element $u \in E$ that is not algebraic over $F$ is called
transcendental over $F$. The reader should not get the idea that all complex
numbers are algebraic over $\mathbb{Q}$. This is far from the case, although establishing that a
given number is transcendental is usually difficult. The next theorem, which we
state without proof,¹ identifies two transcendental numbers.


Theorem 2. The numbers $\pi$ and $e$ (from calculus) are transcendental.


In 1873, Charles Hermite gave the first proof that $e$ is transcendental over $\mathbb{Q}$.
This proof stimulated interest in such questions and, in 1882, Ferdinand Lindemann
succeeded in proving that $\pi$ is also transcendental over $\mathbb{Q}$. This result is famous,


¹For proofs of these assertions and more information on transcendental numbers, see Niven, I.,
Irrational Numbers, Carus Monograph II, Washington, DC: Mathematical Association of America,
1956.




partly because it settled a classical question dating back to the ancient Greeks: Is
it possible, using only compass and straightedge, to square the circle (construct a
square whose area equals that of a given circle)? The answer is no, because the
existence of such a construction implies that $\pi$ is algebraic (see Section 6.5).

These results are difficult, so it is surprising that there are “more” transcendental
complex numbers (over $\mathbb{Q}$) than there are algebraic ones. In fact, it can be
shown that the set $A$ of complex numbers that are algebraic over $\mathbb{Q}$ is “countable”
in the sense that there exists a bijection $A \to \mathbb{N}$. However, $\mathbb{C}$ is “uncountable” in
the sense that no one-to-one map $\mathbb{C} \to \mathbb{N}$ exists. This was first established in 1874
by Georg Cantor, and he gave another proof in 1891 using his celebrated “diagonal
method.”

If E ⊇ F are fields, and if u₁, ..., uₙ are elements of E, let F(u₁, ..., uₙ) denote
the intersection of all subfields of E that contain F and also contain each of the
elements uᵢ. More formally,


$$F(u_1, \ldots, u_n) = \bigcap \{K \supseteq F \mid K \text{ is a subfield of } E \text{ containing each } u_i\}.$$


It is easy to verify that this is again a field containing F and all the uᵢ. Thus, it
is the smallest such subfield of E (in the sense that it is contained in every such
subfield). The field F(u₁, ..., uₙ) is called the subfield of E generated over F by
the elements u₁, ..., uₙ. If u ∈ E, the extension F(u) is called a simple extension
of F in E.


Example 6. Show that R(i) = C.


Solution. C is certainly a field containing R and i, so R(i) ⊆ C. However, given
z = a + bi in C, where a, b ∈ R, z lies in any field containing R and i and, in
particular, z ∈ R(i). Thus, C ⊆ R(i), so C = R(i) as asserted. □


Example 7. Show that Q(i, −i) = Q(i).


Solution. Q(i) ⊆ Q(i, −i) because Q(i, −i) contains Q and i. But Q(i) is a field
containing Q and i, and so it also contains −i. Hence, Q(i, −i) ⊆ Q(i). □


Example 8. Show that Q(√2) = {a + b√2 | a, b ∈ Q}.


Solution. Write E = {a + b√2 | a, b ∈ Q}. Clearly, E is contained in any subfield
containing Q and √2; in particular, E ⊆ Q(√2). On the other hand, E is a field
(by Example 4 §3.2), Q ⊆ E and √2 ∈ E, so Q(√2) ⊆ E. Hence, E = Q(√2). □


Example 9. If E ⊇ F are fields and u ∈ E, then F(u) = F if and only if u ∈ F.


If E ⊇ F are fields and u and v are elements of E, then E ⊇ F(u) is also an
extension of fields. Thus, we can speak of F(u)(v). This is evidently a subfield of
E containing F, u, and v, and so F(u, v) ⊆ F(u)(v). In fact, this is equality: The
subfield F(u, v) contains F(u) (because it contains F and u), and it also contains
v, so F(u)(v) ⊆ F(u, v). A similar argument proves the next result (Exercise 25).


Lemma 1. Let E ⊇ F be fields and let u₁, u₂, ..., uₙ be elements of E, n ≥ 2.
Then for 1 ≤ k ≤ n − 1, we have


$$F(u_1, \ldots, u_k, u_{k+1}, \ldots, u_n) = F(u_1, \ldots, u_k)(u_{k+1}, \ldots, u_n).$$


For fields E ⊇ F, Lemma 1 implies that the subfield F(u₁, ..., uₙ) generated over
F by u₁, ..., uₙ can be built up as a chain of simple extensions:


$$\begin{aligned}
F(u_1, u_2) &= F(u_1)(u_2), \\
F(u_1, u_2, u_3) &= F(u_1, u_2)(u_3), \\
&\ \ \vdots \\
F(u_1, u_2, \ldots, u_n) &= F(u_1, \ldots, u_{n-1})(u_n).
\end{aligned}$$


This highlights the importance of studying simple extensions F(u).
If u is transcendental over F, it is routine to verify (Exercise 31) that


$$F(u) = \{f(u)g(u)^{-1} \mid f, g \in F[x];\ g \neq 0\}.$$


Hence, F(u) ≅ F(x), the field of quotients of the integral domain F[x]. However,
our interest lies in simple extensions F(u), where u is algebraic over F.

Theorem 1 asserts that every element u of a finite extension E of F is algebraic
over F. This has a partial converse: If E ⊇ F is a field extension, and if u ∈ E is
algebraic over F, then u belongs to a finite extension of F contained in E. Indeed,
we show that F(u) is a finite extension of F containing u. We present this fact,
along with an explicit description of F(u), in Theorem 4. However, the proof
involves another important notion.

Let u be algebraic over F, where u lies in some extension of F. Then f(u) = 0
for some nonzero polynomial f ∈ F[x], which we may assume to be monic.


Theorem 3. Let u be algebraic over F. Choose a monic polynomial m of minimal
degree such that m(u) = 0. Then

(1) m is irreducible in F[x].

(2) If f is in F[x], then f(u) = 0 if and only if m | f.

(3) m is uniquely determined by u.


Proof. (1) Suppose that m = fg in F[x], where deg f < deg m and deg g < deg m.
Then f(u)g(u) = m(u) = 0 implies that f(u) = 0 or g(u) = 0, contradicting the
minimality of deg m.

(2) If f(u) = 0, use the division algorithm (Theorem 4 §4.1) to write f = qm + r
in F[x], where r = 0 or deg r < deg m. Then r(u) = f(u) − q(u)m(u) = 0, so r ≠ 0
would contradict the choice of m. Thus, r = 0 and m | f. The converse is clear.

(3) Let m′ be another monic polynomial of minimal degree with m′(u) = 0.
Then m | m′ by (2). Moreover, m′ | m because (2) is also true of m′. Thus, m = m′
because both are monic (Theorem 9 §4.2). □


If u is algebraic over F, the polynomial m in Theorem 3 is called the minimal
polynomial of u over F. Since m is uniquely determined by u and F, we are entitled
to define the degree of u over F by deg_F(u) = deg(m). Note that if p ∈ F[x] is
monic, irreducible and p(u) = 0, then p = m by Theorem 3(2).


Example 10. Find the minimal polynomial of u = √(1 + √3) over Q.


Solution. We have u² = 1 + √3, so (u² − 1)² = 3, that is u⁴ − 2u² − 2 = 0. The
polynomial x⁴ − 2x² − 2 is irreducible in Q[x] by the Eisenstein criterion (with the
prime 2). Hence m = x⁴ − 2x² − 2 by Theorem 3, so deg_Q(u) = 4. □
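
The Eisenstein test used in Example 10 is mechanical, so it can be checked by a few lines of code. The following is a minimal sketch, assuming a polynomial with integer coefficients is stored as a list with the constant term first; the function name `eisenstein_applies` is illustrative and not notation from the text.

```python
def eisenstein_applies(coeffs, p):
    """Return True if Eisenstein's criterion holds for the prime p.

    coeffs lists integer coefficients from the constant term upward,
    so x^4 - 2x^2 - 2 is [-2, 0, -2, 0, 1].  The caller is assumed
    to pass a prime p; primality is not checked here.
    """
    *lower, leading = coeffs
    return (
        leading % p != 0                      # p does not divide the leading coefficient
        and all(c % p == 0 for c in lower)    # p divides every other coefficient
        and lower[0] % (p * p) != 0           # p^2 does not divide the constant term
    )


print(eisenstein_applies([-2, 0, -2, 0, 1], 2))  # x^4 - 2x^2 - 2 with the prime 2
print(eisenstein_applies([-2, 0, -2, 0, 1], 3))  # the prime 3 gives no information
```

A result of False only means that the criterion is silent for that prime; it says nothing about reducibility.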




The minimal polynomial provides a lot of information about an element u alge- 
braic over a field F. In particular it gives a useful description of the simple extension 
F(u) generated by u over F, and will be referred to several times. 


Theorem 4. If E ⊇ F are fields, let u ∈ E be algebraic over F of degree n.
(1) F(u) = {a₀ + a₁u + ··· + aₙ₋₁uⁿ⁻¹ | aᵢ ∈ F} = {f(u) | f ∈ F[x]}.
(2) {1, u, ..., uⁿ⁻¹} is an F-basis of F(u), so [F(u) : F] = n = deg_F(u).
(3) F(u) ≅ F[x]/(m), where m is the minimal polynomial of u over F.


Proof. Define θ: F[x] → E by θ(f) = f(u). Then θ is a ring homomorphism and
ker θ = {f ∈ F[x] | f(u) = 0} = (m) by Theorem 3, where m is the minimal
polynomial of u over F. Then, by the ring isomorphism theorem (Theorem 4 §3.4),

$$\frac{F[x]}{(m)} \cong \operatorname{im}\theta = \{f(u) \mid f \in F[x]\}.$$

Now F(u) is a field containing F and u, and so contains f(u) for all f ∈ F[x]. Hence,
im θ ⊆ F(u). But F[x]/(m) is a field because m is irreducible (Theorem 3 §4.3), so
im θ is a field. Because im θ contains F and u, this shows that F(u) ⊆ im θ. Thus,
F(u) = im θ, which proves (1) and (3).
It remains to show that B = {1, u, ..., uⁿ⁻¹} is an F-basis of F(u). To show
that B is independent, let


$$a_0 + a_1u + \cdots + a_{n-1}u^{n-1} = 0, \qquad a_i \in F.$$


Then g(u) = 0, where g = a₀ + a₁x + ··· + aₙ₋₁xⁿ⁻¹, so g ≠ 0 in F[x] would contradict
the choice of the minimal polynomial m. Hence g = 0, so aᵢ = 0 for all i. Thus B is
independent. To show that B spans F(u), let f(u) ∈ F(u) and write f = qm + r in
F[x] where, since deg m = n, r has the form r = b₀ + b₁x + ··· + bₙ₋₁xⁿ⁻¹, bᵢ ∈ F.
As m(u) = 0, we get f(u) = r(u) = b₀ + b₁u + ··· + bₙ₋₁uⁿ⁻¹. Thus B spans F(u)
and the proof is complete. □
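
The spanning argument in this proof is effectively an algorithm: dividing f by m and keeping the remainder r gives the coordinates of f(u) in the basis {1, u, ..., uⁿ⁻¹}. The following is a minimal sketch over Q, assuming polynomials are stored as coefficient lists (constant term first) and that m is monic; the name `reduce_mod` and the list representation are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def reduce_mod(f, m):
    """Remainder of f on division by the monic polynomial m.

    Polynomials are lists of coefficients, constant term first.
    The result has exactly deg(m) entries: the coordinates of f(u)
    with respect to 1, u, ..., u^(n-1), where m is the minimal
    polynomial of u.
    """
    n = len(m) - 1
    r = [Fraction(c) for c in f]
    while len(r) > n:
        lead = r.pop()              # coefficient of the current top power
        shift = len(r) - n          # top power minus n
        for i in range(n):          # subtract lead * x^shift * m (m is monic)
            r[shift + i] -= lead * m[i]
    return r + [Fraction(0)] * (n - len(r))


# With m = x^2 - 2x + 2 (the minimal polynomial of 1 + i over Q):
print(reduce_mod([0, 0, 1], [2, -2, 1]))      # coordinates of u^2
print(reduce_mod([0, 0, 0, 1], [2, -2, 1]))   # coordinates of u^3
```

Multiplying two elements of F(u) then amounts to multiplying their polynomials and calling `reduce_mod` on the product.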


Note that Theorem 4(3) shows that the field F and the minimal polynomial of
the algebraic element u completely determine the extension F(u) up to isomorphism,
and hence that F(u) ≅ F(v) whenever u and v have the same minimal polynomial over F.

The description of F(u) in Theorem 4(1) makes it clear how to add in F(u). However,
multiplying requires the minimal polynomial, as Example 11 demonstrates.


Example 11. Describe the multiplication in Q(u) where u = 1 + i.


Solution. We have (u − 1)² = i² = −1, so u² − 2u + 2 = 0. Write m = x² − 2x + 2.
Since m is irreducible over Q (it has no root in Q), it is the minimal polynomial of
u. Hence, Q(u) = {a + bu | a, b ∈ Q} by Theorem 4, and as u² = 2u − 2,


$$\begin{aligned}
(a + bu)(a' + b'u) &= aa' + (ab' + ba')u + bb'u^2 \\
&= (aa' - 2bb') + (ab' + ba' + 2bb')u.
\end{aligned}$$


This describes the multiplication in Q(u). □
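
The multiplication rule of Example 11 can be checked against ordinary complex arithmetic. The following is a small sketch, assuming an element a + bu is stored as the pair (a, b) of rationals; the helper names `multiply` and `to_complex` are hypothetical.

```python
from fractions import Fraction


def multiply(p, q):
    """Product of a + b*u and c + d*u in Q(u), using u^2 = 2u - 2."""
    a, b = p
    c, d = q
    return (a * c - 2 * b * d, a * d + b * c + 2 * b * d)


def to_complex(p):
    """Evaluate a + b*u at u = 1 + i as an ordinary complex number."""
    a, b = p
    return float(a) + float(b) * (1 + 1j)


x = (Fraction(1, 2), Fraction(3))
y = (Fraction(-2), Fraction(5, 4))
print(multiply(x, y))                           # exact coordinates of the product
print(to_complex(multiply(x, y)))               # the product, evaluated in C
print(to_complex(x) * to_complex(y))            # the same product computed in C
```

The last two printed values should agree up to floating-point rounding, while the first line stays exact.
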
Example 12. Show that [R : Q] is not finite.


Solution. The polynomial xⁿ − 2 is irreducible over Q for any n ≥ 1 by the Eisenstein
criterion (Theorem 8 §4.2). If we write E = Q(ⁿ√2), this means that [E : Q] = n
by Theorem 4. Thus, R (as a vector space over Q) contains subspaces of arbitrarily
large dimension and so cannot be finite dimensional by Theorem 4 §6.1. □


Before proceeding, we require the following basic result about finite extensions. 


Theorem 5. Multiplication Theorem. Let K ⊇ E ⊇ F be fields. Then [K : F]
is finite if and only if both [K : E] and [E : F] are finite. In this case,


$$[K : F] = [K : E]\,[E : F].$$


Moreover, if {e₁, ..., eₘ} is any F-basis of E and {k₁, ..., kₙ} is any E-basis of
K, then

$$B = \{e_i k_j \mid 1 \le i \le m,\ 1 \le j \le n\}$$

is an F-basis of K.


Proof. If [K : F] is finite, then [E : F] is also finite by Theorem 8 §6.1 because E
is an F-subspace of K. Next, any F-basis of K is certainly an E-spanning set of K,
so [K : E] is finite by Theorem 6 §6.1.

Conversely, in the notation of the theorem, it suffices to prove that B is an
F-basis of K. First, B spans K. For if c ∈ K, then, because {k₁, ..., kₙ} is an
E-basis of K, write $c = \sum_{j=1}^{n} b_j k_j$, where bⱼ ∈ E for each j. But then, for each j,
$b_j = \sum_{i=1}^{m} a_{ij} e_i$, where aᵢⱼ ∈ F for all i and j. Combining these expressions gives


$$c = \sum_{j=1}^{n} \Bigl( \sum_{i=1}^{m} a_{ij} e_i \Bigr) k_j = \sum_{j=1}^{n} \sum_{i=1}^{m} a_{ij} e_i k_j.$$

It follows that B spans K. Finally, to prove that B is F-independent, let

$$\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} e_i k_j = 0,$$

where aᵢⱼ ∈ F for all i and j. Then $\sum_{j=1}^{n} \bigl( \sum_{i=1}^{m} a_{ij} e_i \bigr) k_j = 0$, so, as {k₁, ..., kₙ} is
E-independent, $\sum_{i=1}^{m} a_{ij} e_i = 0$ for each j. But then aᵢⱼ = 0 for all i and j because
{e₁, ..., eₘ} is F-independent. Hence, B is F-independent. □


The multiplication theorem gives a divisibility relationship between dimensions, 
so it plays a role in field theory somewhat analogous to the role of Lagrange’s 
theorem for groups. Consequently, we refer to it constantly, both in this chapter 
and in Chapter 10 on Galois theory. 


Corollary. Let E ⊇ F be fields and let u ∈ E be algebraic over F. If v ∈ F(u),
then v is also algebraic over F and deg_F(v) divides deg_F(u).

Proof. Here, F(u) ⊇ F(v) ⊇ F and F(u) ⊇ F is finite, so v ∈ F(u) is algebraic over
F by Theorem 1. Also, [F(u) : F] = [F(u) : F(v)][F(v) : F] by Theorem 5, so we
are done because deg_F(u) = [F(u) : F] and deg_F(v) = [F(v) : F]. □

Example 13. If u = ∛2, show that Q(u) = Q(u²).


Solution. We have Q(u) ⊇ Q(u²) ⊇ Q, and [Q(u) : Q] = deg_Q(u) = 3 because x³ − 2
is irreducible in Q[x]. Hence, [Q(u²) : Q] = 1 or 3 by the multiplication theorem.
But [Q(u²) : Q] ≠ 1 because u² ∉ Q, so [Q(u²) : Q] = 3. Thus, Q(u) = Q(u²) by
Theorem 8 §6.1. □
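
As a cross-check on the dimension argument (offered as a remark, not as part of the solution in the text), the inclusion Q(u) ⊆ Q(u²) can also be seen by a direct computation. Since u³ = 2,

$$u^4 = u \cdot u^3 = 2u, \qquad\text{so}\qquad u = \tfrac{1}{2}\,(u^2)^2 \in \mathbb{Q}(u^2).$$

The advantage of the dimension argument is that it needs no such explicit formula.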


Example 14. Let E ⊇ F be fields and let u, v ∈ E. If u and u + v are algebraic
over F, show that v is algebraic over F.


Solution. Write L = F(u + v) so that L(u) = F(u, v). Hence, v ∈ L(u) so it suffices
(by Theorem 1) to show that L(u) ⊇ F is finite. We have the chain of fields
L(u) ⊇ L ⊇ F. But L ⊇ F is finite by Theorem 4 because u + v is algebraic over
F, and L(u) ⊇ L is finite because u is algebraic over L (being algebraic over F).
Hence, L(u) ⊇ F is finite by the multiplication theorem as required. □


Example 15. Let E = Q(√2, √5). Find [E : Q], exhibit a Q-basis of E, and show
that E = Q(√2 + √5). Then find the minimal polynomial of √2 + √5 over Q.


Solution. We write L = Q(√2) for convenience so that E = L(√5) by Lemma 1.
Now x² − 2 is the minimal polynomial of √2 over Q (it has no root in Q),
so Theorem 4 shows that {1, √2} is a Q-basis of L. We claim that x² − 5 is the
minimal polynomial of √5 over L. Because √5 and −√5 are the only roots of
x² − 5 in R, we merely need to show that √5 ∉ L. But √5 ∈ L means that
√5 = a + b√2, with a, b ∈ Q (and a ≠ 0, b ≠ 0), which implies (by squaring) that
√2 ∈ Q, a contradiction. Hence, {1, √5} is an L-basis of E = L(√5). As
E ⊇ L ⊇ Q, it also follows from the multiplication theorem that {1, √2, √5, √10}
is a Q-basis of E, and so [E : Q] = 4.

We now write u = √2 + √5. Then u² = 7 + 2√10, so u³ = 17√2 + 11√5. In
particular, 17√2 + 11√5 ∈ Q(u) and √2 + √5 = u ∈ Q(u). Because Q(u) is a field,
the reader can verify that both √2 and √5 are in Q(u) (indeed, √2 = (u³ − 11u)/6
and √5 = (17u − u³)/6), so E ⊆ Q(u). The reverse inclusion is obvious, so
E = Q(u). Hence [Q(u) : Q] = [E : Q] = 4, so the minimal polynomial m of u over Q
has degree 4. But u² = 7 + 2√10 yields (u² − 7)² = 40, from which
u⁴ − 14u² + 9 = 0. From Theorem 3, m divides x⁴ − 14x² + 9, so
m = x⁴ − 14x² + 9 because both are monic of degree 4. Incidentally, this shows that
x⁴ − 14x² + 9 is irreducible over Q. □
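
The computations in Example 15 can be verified with exact arithmetic in the basis {1, √2, √5, √10}. The following is a minimal sketch, assuming an element a + b√2 + c√5 + d√10 is stored as the tuple (a, b, c, d); the helper names `mul`, `add` and `scale` are illustrative only.

```python
from fractions import Fraction


def mul(x, y):
    """Multiply two elements of Q(sqrt2, sqrt5) given in the basis 1, sqrt2, sqrt5, sqrt10."""
    a, b, c, d = x
    e, f, g, h = y
    return (
        a * e + 2 * b * f + 5 * c * g + 10 * d * h,   # rational part
        a * f + b * e + 5 * c * h + 5 * d * g,        # coefficient of sqrt2
        a * g + c * e + 2 * b * h + 2 * d * f,        # coefficient of sqrt5
        a * h + d * e + b * g + c * f,                # coefficient of sqrt10
    )


def add(x, y):
    return tuple(s + t for s, t in zip(x, y))


def scale(k, x):
    return tuple(k * s for s in x)


one = (Fraction(1), Fraction(0), Fraction(0), Fraction(0))
u = (Fraction(0), Fraction(1), Fraction(1), Fraction(0))    # u = sqrt2 + sqrt5

u2 = mul(u, u)
u3 = mul(u2, u)
u4 = mul(u2, u2)

value = add(add(u4, scale(-14, u2)), scale(9, one))          # u^4 - 14u^2 + 9
sqrt2 = scale(Fraction(1, 6), add(u3, scale(-11, u)))        # (u^3 - 11u)/6
sqrt5 = scale(Fraction(1, 6), add(scale(17, u), scale(-1, u3)))  # (17u - u^3)/6

print(u2, u3)
print(value)
print(sqrt2, sqrt5)
```

Here `value` should be the zero tuple, and `sqrt2` and `sqrt5` should be the basis vectors for √2 and √5, which confirms that both lie in Q(u).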


Example 15 shows that Q(√2, √5) is in fact a simple extension of Q. More
generally, we can show that Q(u, v) is a simple extension whenever u and v are
algebraic over Q. In fact, this result holds for any field of characteristic 0. In this
generality it is called the primitive element theorem (see Theorem 6 §10.1).

We now turn to a characterization of the finite extensions of a field F as exactly
those obtained by adjoining finitely many algebraic elements.


Theorem 6. A field extension E ⊇ F is finite if and only if E = F(u₁, ..., uₙ),
where each uᵢ ∈ E is algebraic over F.


Proof. First, assume that [E : F] is finite and proceed by induction on [E : F]. If
[E : F] = 1, then E = F = F(1). If [E : F] > 1, choose u ∈ E, u ∉ F. Then
[F(u) : F] > 1 because u ∉ F, so the multiplication theorem gives


$$[E : F(u)] = \frac{[E : F]}{[F(u) : F]} < [E : F].$$


Since E ⊇ F(u) is finite, we get E = F(u)(u₁, ..., uₙ) = F(u, u₁, ..., uₙ) by
induction. The elements u, u₁, ..., uₙ are algebraic over F by Theorem 1.

Conversely, if E = F(u₁, ..., uₙ), where the uᵢ are algebraic over F, we again
use induction on n. If n = 1, then [E : F] is finite by Theorem 4. If n > 1, write
L = F(u₁, ..., uₙ₋₁). Then E ⊇ L ⊇ F and [L : F] is finite by induction. But
E = L(uₙ), so [E : L] is finite by Theorem 4 because uₙ is algebraic over F, and
hence over L. Thus, [E : F] is finite by the multiplication theorem. □




The first consequence of Theorem 6 is the version of the first part of the multi- 
plication theorem for algebraic rather than finite extensions. 


Corollary 1. If K ⊇ E ⊇ F are fields, then K ⊇ F is an algebraic extension if
and only if both K ⊇ E and E ⊇ F are algebraic.


Proof. Assume that K ⊇ E and E ⊇ F are algebraic extensions. If u ∈ K, we
show that u is algebraic over F by showing that u lies in some finite extension of
F (and invoking Theorem 1). Because K ⊇ E is algebraic, let f(u) = 0, where
0 ≠ f ∈ E[x]. If f = e₀ + e₁x + ··· + eₙxⁿ, take L = F(e₀, ..., eₙ). Then u ∈ L(u)
and we claim that L(u) ⊇ F is finite. As L(u) ⊇ L ⊇ F, this follows by the
multiplication theorem because (1) L(u) ⊇ L is finite (since f ∈ L[x]), and (2)
L ⊇ F is finite (the eᵢ are algebraic because E ⊇ F is algebraic by hypothesis).
Hence, K ⊇ F is algebraic. The converse is routine and is left to the reader. □


Corollary 2. If E ⊇ F are fields, let A = {u ∈ E | u is algebraic over F}. Then A
is a subfield of E and so is an algebraic extension of F.


Proof. To prove that A is a subfield of E, it is enough to show that any two
elements u and v of A lie in some finite extension L ⊇ F (then L is algebraic over
F by Theorem 1). But L = F(u, v) fills the bill by Theorem 6. □


The field A in Corollary 2 is called the algebraic closure of F in E. Clearly, it
is the largest algebraic extension of F that is contained in E. The following special
case will be referred to frequently. The field


$$A = \{u \in \mathbb{C} \mid u \text{ is algebraic over } \mathbb{Q}\}$$


is called the field of algebraic numbers. The field A shows that while every finite
extension is algebraic (by Theorem 1), the converse is not true.


Example 16. Show that A is an algebraic extension of Q that is not finite.


Solution. Clearly, A ⊇ Q(ⁿ√2) ⊇ Q for each n ≥ 1, so the argument in Example 12 works. □


Exercises 6.2 


Throughout these exercises, F denotes a field. 


1. In each case, show that u ∈ C is algebraic over Q.


(a) u=V8+V5 (b)u=1+V/1+ 2 
(c) u= V V3 — 2% (d) u=u+v?, where v= 7/2 

2. In each case, show that u ∈ C is algebraic over Q and find the minimal polynomial.
(a)u= V2+ V3 (b) u= VQ+i 
(c)u=V14+ V3 (d) u=VJ1+i 

3. In each case, decide whether u is algebraic or transcendental over F and prove it.
(a) u = √π, F = Q(π) (b) u = √π, F = Q
(c) u = π², F = Q (d) u = 1 + π, F = Q

4. Show that u is algebraic over F = Q(v) and find the minimal polynomial if
(a) u = 1 + i, v = √2 (b) u = √2, v = 1 + i


5. If u ∈ C, u ∉ R, show that C = R(u).
6. If E ⊇ F are fields, show that F(u) = F(au) for all u ∈ E, 0 ≠ a ∈ F.


7. Find the minimal polynomial of u = /3~-i: (a) over R and (b) over Q.

8. If z = a + bi ∈ C, a, b ∈ R, find the minimal polynomial of z over R.

9. Show that F(u, v) = F(u) if and only if v = f(u) for some f ∈ F[x].

10. If u = ~/5, find the minimal polynomial of u over Q and that of u².

11. If E ⊇ F are fields and u ∈ E, show that u is algebraic over F if and only if
[F(u) : F] is finite.

12. In each case, find a basis of E over Q.
(a) B= Q(V2) (b) B= Q(1 -4)
(c) B= Q(v3, V3) (d) E = Q(v2, V3)
(e) E = Q(v3, v5) (f) = Q(v2, 73)

13. In each case, find [E : F].
(a) B= Q(v3 + V5), F = Q(v3) (b) EB = Q(v3, V15), F = Q(v5)
(c) E=Q(v3 +4), F = Q(i) (d) B= Q(¥3, V2), F = Q(v2)

14. Let E ⊇ F be a finite extension and let p ∈ F[x] be irreducible. If p(u) = 0 for some
u ∈ E, show that deg p divides [E : F].

15. If E ⊇ F are fields and [E : F] is prime, show that E = F(u) for all u ∈ E, u ∉ F.

16. If E ⊇ F are fields and u ∈ E has odd degree over F, show that F(u) = F(u²).

17. If E ⊇ F are fields and u ∈ E has prime degree over F, show that the only fields L
such that F(u) ⊇ L ⊇ F are L = F and L = F(u).

18. Let E ⊇ L ⊇ F and E ⊇ M ⊇ F be fields. If [L : F] is prime, show that either
M ⊇ L or M ∩ L = F.

19. Let C ⊇ E ⊇ Q, where E is a field, and assume that [E : Q] = 2. Show that
E = Q(√m), where m is a square-free integer.

20. Let K ⊇ E ⊇ F be fields where [E : F] is finite, and let u ∈ K be algebraic over F.
(a) Show that [E(u) : E] ≤ [F(u) : F].
(b) Show that [E(u) : F(u)] ≤ [E : F].

21. Let E ⊇ F be fields, and let u, v ∈ E be algebraic over F of degrees m, n.
(a) Show that [F(u, v) : F] ≤ mn.
(b) If m and n are relatively prime, show that [F(u, v) : F] = mn.
(c) Is the converse to (b) true? Support your answer.

22. If E = F(u₁, ..., uₙ) and uᵢ is algebraic of degree mᵢ over F for each i, show that
[E : F] ≤ m₁m₂···mₙ.

23. Show that √2 ∉ Q(π). [Hint: Discussion following Lemma 1.]

24. Show that [Q(π) : Q(π³)] is finite and display a basis of Q(π) over Q(π³).

25. Prove Lemma 1.

26. (a) If u² is algebraic over F, show that u is algebraic over F.
(b) If f(u) is algebraic over F, f ∈ F[x], f ∉ F, show that u is algebraic over F.

27. Show that the intersection of any family of subfields of E is again a subfield of E.

28. Let E ⊇ F be fields and let u, v ∈ E. If u + v is algebraic over F, show that v is
algebraic over F(u). [Hint: Treat the case that u is transcendental separately.]

29. Is it possible that u ∉ F(v) is algebraic over F(v) while v is transcendental over
F(u)? Support your answer. [Hint: Exercise 23.]

30. Let E ⊇ F be fields and let u, v ∈ E. If v is transcendental over F but algebraic
over F(u), show that u is algebraic over F(v).

31. Let E ⊇ F be fields and let u ∈ E be transcendental over F.
(a) Show that F(u) = {f(u)g(u)⁻¹ | f, g ∈ F[x]; g(x) ≠ 0}.
(b) Show that F(u) ≅ F(x), the field of quotients of the integral domain F[x].
(c) Show that every element w ∈ F(u), w ∉ F, is transcendental over F.




32. Let p and q in Q satisfy √p ∉ Q and √q ∉ Q(√p).
(a) Show that Q(√p, √q) = Q(√p + √q).
(b) Use Theorem 5 to find a basis of Q(√p, √q) over Q.
(c) Deduce that x⁴ − 2(p + q)x² + (p − q)² is the minimal polynomial of √p + √q
over Q.

33. Let E ⊇ F be fields and let A be the algebraic closure of F in E. If u ∈ E, u ∉ A,
show that u is transcendental over A.

34. Let E ⊇ F be fields and let {e₁, ..., eₘ} be an F-basis of E. If V is a vector space
over E with basis {v₁, ..., vₙ}, show that V has dimension mn as a vector space
over F and exhibit an F-basis of V.

35. (a) Let E ⊇ R ⊇ F, where E ⊇ F is an algebraic extension of fields and R is a
subring of E. Prove that R is a field.
(b) Repeat (a) where R is an F-subspace of E, char F ≠ 2, and u ∈ R implies that
uᵏ ∈ R for all k ≥ 2.
(c) Show that (b) is false if char F = 2. [Hint: Let F be the quotient field of Z₂[x, y].]


6.3 SPLITTING FIELDS 


So far our discussion of an algebraic element u over a field F has concerned a
given extension field E ⊇ F that contains u, and we have described the field F(u)
explicitly as a subfield of E. However, Theorem 4 §6.2 shows that


$$F(u) = \{a_0 + a_1u + \cdots + a_{n-1}u^{n-1} \mid a_i \in F\} \cong \frac{F[x]}{(m)},$$


where m is the (irreducible) minimal polynomial of u over F and deg m = n. Hence,
F(u) does not depend on E, being completely determined by u and F.

So we change our perspective. Suppose that f is a nonconstant polynomial in
F[x] (possibly not irreducible), where F is a field. Is there a field E ⊇ F in which
f has a root? Leopold Kronecker solved this problem in the nineteenth century.


Theorem 1. Kronecker’s Theorem. If F is any field and f is a nonconstant
polynomial in F[x], there is an extension field of F in which f has a root.


We proved this assertion earlier (Theorem 4 §4.3). The idea is simple. Because f
has positive degree, it has a monic irreducible factor p. Hence, E = F[x]/(p) is a
field (by Theorem 3 §4.3) that we explicitly describe as follows: We regard F as
a subfield of E by identifying a = a + (p) for all a ∈ F. In addition, if we write
t = x + (p), the elements of E take the form


$$E = \{a_0 + a_1t + a_2t^2 + \cdots + a_{n-1}t^{n-1} \mid a_i \in F\},$$


where n = deg p. Moreover, {1, t, ..., tⁿ⁻¹} is an F-basis of E, so [E : F] = n.
Finally, p(t) = 0 (easily verified; see Lemma 4 §4.3), so f(t) = 0 because
p is a factor of f. Thus, t is the desired root of f in E. Of course, E = F(t) in
the notation of Section 6.2, and p is the minimal polynomial of t over F. This not
only proves Kronecker’s theorem, but also provides a means of constructing field
extensions of degree n over F by using monic irreducible polynomials of degree n.

Since every polynomial f is a product of irreducible factors, we can repeat this
procedure and construct an extension of F over which f factors completely into
linear factors. Here is an example where f is already irreducible.




Example 1. Find an extension E ⊇ Z₂ in which f = x³ + x + 1 factors completely
into linear factors.


Solution. The polynomial f is irreducible over Z₂ (it has no root in Z₂) so

$$E = \{a_0 + a_1t + a_2t^2 \mid a_i \in \mathbb{Z}_2,\ f(t) = 0\}$$

is a field containing a root t of f by Theorem 2 §4.3. Hence, x + t = x − t is a factor
of f. The division algorithm gives f = (x + t)g, where g = x² + tx + (1 + t²), so it
suffices to show that g also factors completely in E. Trial and error gives g(t²) = 0,
so g = (x + t²)(x + v) for some v ∈ E. Comparing coefficients of x gives t = t² + v,
whence v = t + t². Thus, g = (x + t²)(x + t + t²), and so f = (x + t)(x + t²)(x + t + t²)
factors completely in E[x]. □
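
The "trial and error" in Example 1 can be carried out exhaustively, since E has only eight elements. The following is a minimal sketch, assuming the element a₀ + a₁t + a₂t² is encoded as the integer whose bit i is aᵢ (so t is 2, t² is 4 and t + t² is 6); the encoding and the names `gf8_mul` and `f` are illustrative choices, not part of the text.

```python
MODULUS = 0b1011  # t^3 + t + 1


def gf8_mul(a, b):
    """Multiply two elements of Z_2[t]/(t^3 + t + 1); bit i is the coefficient of t^i."""
    product = 0
    while b:
        if b & 1:
            product ^= a        # addition in characteristic 2 is XOR
        b >>= 1
        a <<= 1
        if a & 0b1000:          # a degree-3 term appeared: replace t^3 by t + 1
            a ^= MODULUS
    return product


def f(x):
    """Evaluate f = x^3 + x + 1 at an element x of the field."""
    return gf8_mul(gf8_mul(x, x), x) ^ x ^ 1


roots = [x for x in range(8) if f(x) == 0]
print(roots)
```

The list `roots` should consist of the encodings of t, t² and t + t², matching the factorization found in the solution.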


The following terminology is commonly used. Let f be a polynomial in F[x] of
degree n ≥ 1, where F is a field. An extension field E ⊇ F is called a splitting
field of f over F if the following conditions are satisfied:


(1) f = a(x − u₁)(x − u₂)···(x − uₙ), a ∈ F, uᵢ ∈ E for each i.
(2) E = F(u₁, u₂, ..., uₙ).


If (1) holds, we say that f splits over E or that f splits in E[x]. Hence, in Example
1, the field E is a splitting field of f over Z₂ because E = Z₂(t).

If E is a splitting field of f over F, the only subfield of E (containing F) in
which f splits is E itself. Indeed, u₁, ..., uₙ are the only roots of f in any subfield
L of E containing F, so if f splits over L, then L = E by (2).


Example 2. The field F is itself a splitting field of every linear polynomial in F[x].


Example 3. The field Q(i) is a splitting field of x² + 1 over Q because we have
x² + 1 = (x + i)(x − i) and Q(i, −i) = Q(i).


For a nonconstant polynomial f in Q[x], the fundamental theorem of algebra
asserts that f splits in C[x]. If the roots are u₁, ..., uₙ, then E = Q(u₁, ..., uₙ)
is a splitting field for f (which is contained in C). The next theorem shows that
splitting fields always exist (though they need not be subfields of C).


Theorem 2. Let f be a polynomial of degree n ≥ 1 over a field F. Then a splitting
field E ⊇ F of f over F exists and [E : F] ≤ n!.


Proof. Use induction on n ≥ 1. If n = 1, take E = F. If n > 1, let p be a monic
irreducible factor of f, and by Kronecker’s theorem, let K ⊇ F be a field containing a
root u₁ of p (and thus of f). Put E₁ = F(u₁) so that u₁ ∈ E₁ and [E₁ : F] = deg p ≤ n.
Now f = (x − u₁)g in E₁[x], where deg g = n − 1. Hence, by induction, let E₂ ⊇ E₁
be a splitting field for g with [E₂ : E₁] ≤ (n − 1)!. Then g = a(x − u₂)···(x − uₙ),
where a ∈ E₁ and uᵢ ∈ E₂ for each i, so E₂ = E₁(u₂, ..., uₙ) = F(u₁, u₂, ..., uₙ)
is a splitting field for f over F. Finally,


$$[E_2 : F] = [E_2 : E_1]\,[E_1 : F] \le (n-1)! \cdot n = n!$$


by the multiplication theorem, which completes the proof. □




Example 4. Find a splitting field E ⊇ Q of f = x⁴ − 2x² − 3, where [E : Q] = 4.


Solution. Because f = (x² − 3)(x² + 1), the roots of f in C are ±√3 and ±i. Hence,
E = Q(√3, i) is the required splitting field. We have E = L(i), where L = Q(√3)
and [E : L] = 2 = [L : Q], as the reader can verify. Hence, [E : Q] = 2 · 2 = 4 by the
multiplication theorem. □


Example 4 shows that if a polynomial f has degree n over F, the degree over 
F of its splitting field can be much smaller than the bound of n! in Theorem 2. 
Nonetheless, this bound is the best possible, as Example 5 shows. 


Example 5. Find a splitting field E of f = x³ − 5 over Q such that [E : Q] = 6.


Solution. The roots of f in C are u, uω, and uω², where u = ∛5, and ω = e^(2πi/3)
is a cube root of unity. Clearly, E = Q(u, uω, uω²) = Q(u, ω) is a splitting field.
Write L = Q(u). As f is irreducible over Q, it is the minimal polynomial of u, so
[L : Q] = 3. Since [E : Q] = [E : L][L : Q], it remains to show that [E : L] = 2. As
E = L(ω) = L(uω) (because u ∈ L), this follows if we can show that the minimal
polynomial of uω over L has degree 2. Note that

$$f = x^3 - u^3 = (x - u)(x^2 + ux + u^2)$$

in L[x]. If we write p = x² + ux + u², then p(uω) = 0 because f(uω) = 0 and uω ≠ u.
Similarly, p(uω²) = 0, so p has no root in L (L ⊆ R but uω ∉ R and uω² ∉ R).
Thus, p is irreducible over L, and so it is the minimal polynomial of uω over L. Hence,
[E : L] = deg p = 2, which completes the argument. □


Example 6. If F is any field, show that any splitting field of a quadratic f in F[x]
is a simple extension F(u) of F.


Solution. Let f = ax² + bx + c, a ≠ 0, and let E ⊇ F be a splitting field of f. If
u, v ∈ E are the roots of f, then f = a(x − u)(x − v), so comparing coefficients of
x gives b = −a(u + v). Thus v = −u − a⁻¹b ∈ F(u), so E = F(u, v) = F(u). □


Example 7. Two different irreducible polynomials can have the same splitting
field. For example, both x² − 2 and x² − 2x − 1 have splitting field Q(√2) because
the roots are ±√2 and 1 ± √2, respectively.


Example 7 notwithstanding, the splitting field of a polynomial f in F[x] is
uniquely determined by f up to isomorphism. In fact, we prove a slightly stronger
result that utilizes a commonly occurring concept in field theory.

Let R ⊇ F and R̄ ⊇ F̄, where F and F̄ are fields that are subrings of the rings
R and R̄, respectively. Given a ring isomorphism σ: F → F̄, a ring isomorphism
σ̂: R → R̄ is said to extend σ if σ̂(a) = σ(a) holds for every a ∈ F (see the diagram).

[Diagram: σ̂: R → R̄ above σ: F → F̄, with F ⊆ R and F̄ ⊆ R̄.]

An important instance occurs in the following context. Let σ: F → F̄ be
an isomorphism of fields. Given f ∈ F[x], define a new polynomial f^σ ∈ F̄[x] as
follows: If f = a₀ + a₁x + ··· + aₙxⁿ, aᵢ ∈ F, let


$$f^\sigma = \sigma(a_0) + \sigma(a_1)x + \cdots + \sigma(a_n)x^n. \tag{*}$$


Then the mapping F[x] → F̄[x] given by f ↦ f^σ is a ring isomorphism that extends
σ (Exercise 16). This mapping is very useful; our present interest in it is as follows.




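Suppose p is a monic irreducible polynomial in F[x]. Then p^σ is monic in F̄[x]
(as σ(1) = 1) and irreducible (because deg f^σ = deg f for all f ∈ F[x]). Now define


$$\varphi : \frac{F[x]}{(p)} \to \frac{\bar{F}[x]}{(p^\sigma)} \quad\text{by}\quad \varphi(f + (p)) = f^\sigma + (p^\sigma) \tag{**}$$


for all f ∈ F[x]. Then φ is well defined (and one-to-one) because


$$\begin{aligned}
f + (p) = g + (p) &\iff p \mid (f - g) \text{ in } F[x] \\
&\iff p^\sigma \mid (f^\sigma - g^\sigma) \text{ in } \bar{F}[x] \\
&\iff f^\sigma + (p^\sigma) = g^\sigma + (p^\sigma).
\end{aligned}$$

Now the fact that f ↦ f^σ is a ring isomorphism shows that φ is also a ring
isomorphism. We need this result in the proof of Theorem 3.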


Theorem 3. Let σ: F → F̄ be an isomorphism of fields. Given a monic irreducible
polynomial p in F[x], let u be a root of p in an extension field E ⊇ F and let v be
a root of p^σ in an extension field Ē ⊇ F̄. Then there is a unique isomorphism


$$F(u) \to \bar{F}(v) \quad\text{given by}\quad f(u) \mapsto f^\sigma(v), \quad f \in F[x],$$

that extends σ and carries u to v.


Proof. The polynomial p is the minimal polynomial of u over F. Hence, as in the
proof of Theorem 4 §6.2, the mapping θ: F[x] → F(u) given by θ(f) = f(u) is an
onto ring homomorphism with ker θ = (p). This mapping induces an isomorphism
F(u) ≅ F[x]/(p) given by f(u) ↔ f + (p). Similarly, F̄(v) ≅ F̄[x]/(p^σ). Now
compose these mappings with the isomorphism φ in (**) to get

$$F(u) \to \frac{F[x]}{(p)} \xrightarrow{\ \varphi\ } \frac{\bar{F}[x]}{(p^\sigma)} \to \bar{F}(v), \qquad
f(u) \mapsto f + (p) \mapsto f^\sigma + (p^\sigma) \mapsto f^\sigma(v).$$

Hence, the composite map f(u) ↦ f^σ(v) is an isomorphism F(u) → F̄(v). This
map carries u to v (take f = x), so it remains to verify that it extends σ. If
a ∈ F, let g = a be the corresponding constant polynomial. Then g^σ = σ(a), so
a = g(u) ↦ g^σ(v) = σ(a), as required. The uniqueness is Exercise 16. □


The special case of Theorem 3 where F̄ = F and σ is the identity map is worth
mentioning. Let p be a monic irreducible polynomial and let u and v be two roots
of p in suitable extension fields E ⊇ F and Ē ⊇ F. Then the fields F(u) and F(v)
are isomorphic. In fact, if deg p = n, the map σ̂: F(u) → F(v) given by


$$\hat{\sigma}(a_0 + a_1u + \cdots + a_{n-1}u^{n-1}) = a_0 + a_1v + \cdots + a_{n-1}v^{n-1}$$


is an isomorphism that carries u to v and fixes F in the sense that σ̂(a) = a for all
a ∈ F (that is, σ̂ extends the identity map 1_F: F → F).

Theorem 3 will be used repeatedly in Chapter 10. For now, our chief interest in 
the result is that it enables us to prove that splitting fields are unique. 


Theorem 4. Let σ: F → F̄ be an isomorphism of fields, let f ∈ F[x] be a
nonconstant polynomial, and let f^σ ∈ F̄[x] be as given in (*). If E ⊇ F is a
splitting field for f and Ē ⊇ F̄ is a splitting field for f^σ, there is an isomorphism
E → Ē that extends σ.




Proof. Use induction on n = deg f = deg f^σ (see the diagram). If n = 1, then
E = F and Ē = F̄, so σ itself is the required map. If n > 1, let u ∈ E be a root
of a monic irreducible divisor p of f and let v ∈ Ē be a root of p^σ. Then σ
extends (by Theorem 3) to an isomorphism τ: F(u) → F̄(v) such that τ(u) = v.

[Diagram: E → Ē above τ: F(u) → F̄(v) above σ: F → F̄.]

Now write f = a(x − u)(x − u₂)···(x − uₙ) in E[x], a ∈ F, uᵢ ∈ E, and define
g = a(x − u₂)···(x − uₙ). Then E = F(u)(u₂, ..., uₙ) is a splitting field of g over
F(u). However, f^σ = f^τ = [x − τ(u)]g^τ = (x − v)g^τ, and hence Ē is a splitting
field of g^τ over F̄(v). Then, by induction, there is an isomorphism E → Ē that
extends τ and so extends σ. This completes the proof. □


Example 8. Find a splitting field E for f = x⁴ + x³ + x² + 1 in Z₂[x], and factor f
completely in E[x].


Solution. We have f(1) = 0 because char Z₂ = 2, so f = (x + 1)(x³ + x + 1) by the
division algorithm. Now g = x³ + x + 1 is irreducible over Z₂ (it has no root in Z₂)
so, as in Example 7 §4.3, we obtain a field


$$E = \{a_0 + a_1t + a_2t^2 \mid a_i \in \mathbb{Z}_2 \text{ and } t^3 = 1 + t\}.$$


Then t is a root of g, so g = (x + t)(x² + tx + (1 + t²)), again by the division
algorithm. Now it remains to determine if h = x² + tx + (1 + t²) splits in E[x]. If
we set h(a + bt + ct²) = 0, where a, b, c ∈ Z₂, then comparing coefficients of the
powers of t (and using the fact that d² = d for any d ∈ Z₂), we obtain the roots
t² and t + t². Hence, h = (x + t²)(x + t + t²) and we obtain
f = (x + 1)(x + t)(x + t²)(x + t + t²), a complete factorization in E[x]. In
particular, E is a splitting field for f. □
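
The factorization in Example 8 can be confirmed by multiplying the four linear factors back together. The following is a sketch under the same assumed encoding as before (bit i of an integer holds the coefficient of tⁱ, so t is 2 and t² is 4); polynomials in x are lists of field elements with the constant term first, and the names `gf8_mul` and `poly_mul` are hypothetical.

```python
MODULUS = 0b1011  # t^3 = t + 1 in E


def gf8_mul(a, b):
    """Multiply two elements of E = Z_2[t]/(t^3 + t + 1)."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # reduce as soon as a t^3 term appears
            a ^= MODULUS
    return product


def poly_mul(p, q):
    """Multiply polynomials in x with coefficients in E, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf8_mul(a, b)
    return out


T, T2 = 0b010, 0b100
# x + 1, x + t, x + t^2, x + t + t^2 (signs are irrelevant in characteristic 2)
factors = [[1, 1], [T, 1], [T2, 1], [T ^ T2, 1]]

product = [1]
for factor in factors:
    product = poly_mul(product, factor)
print(product)
```

The resulting list should be the coefficient list of x⁴ + x³ + x² + 1, with every coefficient lying in Z₂ even though the factors do not.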


Algebraic Closures 


In view of all our efforts to find splitting fields for polynomials, the fact that every
nonconstant polynomial in C[x] splits is remarkable, to say the least. The next
theorem characterizes when this happens.


Theorem 5. The following conditions on a field C are equivalent:
(1) Every nonconstant polynomial in C[x] has a root in C.
(2) Every irreducible polynomial in C[x] has degree 1.
(3) Every nonconstant polynomial in C[x] splits in C[x].
(4) If E ⊇ C is an algebraic extension, then E = C.


Proof. (1) ⇒ (2) ⇒ (3). These are left to the reader.

(3) ⇒ (4). If u ∈ E, let f(u) = 0, where f is a nonzero polynomial in C[x]. Then
f is not constant, so by (3), f = a(x − b₁)(x − b₂)···(x − bₙ), where a, bᵢ ∈ C. Thus,
f(u) = 0 means that u = bᵢ for some i, so u ∈ C, proving (4).

(4) ⇒ (1). If f is a nonconstant polynomial in C[x], let u be a root of f in some
extension field E by Theorem 1. Thus, C(u) ⊇ C is an algebraic extension (it is
finite by Theorem 4 §6.2), so C(u) = C by (4). Hence, u ∈ C and (1) follows. □




We can express condition (4) in Theorem 5 by saying that the field $C$ has no proper
algebraic extension. With this in mind, we say that a field $C$ is algebraically
closed if it satisfies the conditions in Theorem 5. The fundamental theorem of
algebra (Section 6.6) asserts that each nonconstant polynomial in $\mathbb{C}[x]$ has a root
in $\mathbb{C}$. Thus:


Example 9. $\mathbb{C}$ is algebraically closed.

If $E \supseteq F$ is a field extension, Corollary 2 of Theorem 6 §6.2 shows that

$A = \{u \in E \mid u \text{ is algebraic over } F\}$

is a subfield of $E$ containing $F$, called the algebraic closure of $F$ in $E$. The field $A$
is clearly the largest algebraic extension of $F$ contained in $E$. If $E$ is algebraically
closed, we have Theorem 6.


Theorem 6. If $C \supseteq F$ are fields and $C$ is algebraically closed, then the algebraic
closure

$A = \{u \in C \mid u \text{ is algebraic over } F\}$

of $F$ in $C$ is itself algebraically closed.


Proof. If $f$ is a nonconstant polynomial in $A[x]$, then $f \in C[x]$, so $f$ has a root $u$ in $C$
by hypothesis. By Theorem 5, we must show that $u \in A$. If $f = a_0 + a_1 x + \cdots + a_n x^n$,
$a_i \in A$, write $E = F(a_0, a_1, \ldots, a_n)$. Then $E \supseteq F$ is a finite extension (by
Theorem 6 §6.2 because each $a_i$ is algebraic over $F$), and $[E(u) : E]$ is finite because $u$ is
algebraic over $E$. Hence, $E(u) \supseteq F$ is finite and so algebraic. Since $u \in E(u)$, it
follows that $u$ is algebraic over $F$ as required. ∎


If $F$ is a field, a field extension $A \supseteq F$ is called an algebraic closure of $F$ if $A$
is an algebraic extension of $F$ that is algebraically closed. Thus, an algebraic closure
of a field $F$ is a maximal algebraic extension of $F$ in the sense that it is an algebraic
extension with no proper algebraic extension (by Theorem 5). Theorem 6 shows
that any subfield $F$ of an algebraically closed field $C$ has an algebraic closure (its
algebraic closure in $C$). Recall that the field $A$ of algebraic numbers is defined by

$A = \{u \in \mathbb{C} \mid u \text{ is algebraic over } \mathbb{Q}\}.$

Then specializing Theorem 6 to the case $F = \mathbb{Q}$, $C = \mathbb{C}$, gives

Corollary. The field $A$ of algebraic numbers is an algebraic closure of $\mathbb{Q}$.

In fact, we have Theorem 7.


Theorem 7. Every field $F$ has an algebraic closure $A \supseteq F$. Moreover, if $A' \supseteq F$ is
another algebraic closure, there is an isomorphism $\theta : A \to A'$ that fixes $F$.


The proof requires Zorn’s Lemma (see Appendix C) and is omitted.⁷²


Leopold Kronecker (1823-1891) Kronecker was born into a prosperous family and in 
addition to his mathematical studies, he actively pursued business interests in his early 
years. He was so successful that by the time he was 30, he could afford to devote himself 
entirely to mathematics. He eventually succeeded his teacher Kummer as professor at 
the University of Berlin. 


⁷² See McCarthy, P.J., Algebraic Extensions of Fields, Waltham, MA: Blaisdell, 1966, p. 22.




Kronecker worked primarily in algebraic number theory, and he is said to be one of the 
inventors of the theory (with Kummer and Dedekind). He produced mathematics of 
first quality and was one of the first algebraists to understand thoroughly the work of 
Galois. However, he insisted on dealing only with numbers such as √2, which he could
construct from the rational numbers by a finite process. He categorically rejected the
real-number constructions of his day that, using infinite limiting processes, gave meaning
to transcendental numbers such as π. He used to say, “God made the integers, and all
the rest is the work of man.” 


This point of view brought him into conflict with Karl Weierstrass and Georg Cantor who 
were creating modern analysis and set theory. While they remained friends, Kronecker 
and Weierstrass debated this issue all their lives. However, Kronecker’s attack deeply 
affected the hypersensitive Cantor and was likely a factor in the breakdowns he suffered 
in his later years. Cantor subsequently was awarded the recognition he deserved, whereas 
Kronecker’s point of view found little support among mathematicians of the day. 


Exercises 6.3 


Throughout these exercises, $F$ denotes a field.


1. In each case, find the splitting field $E$ of $f$ over $\mathbb{Q}$ and find $[E : \mathbb{Q}]$.
(a) f=a3+1 (b) f=at+1 
(c) f=2* ~62*—7 (d) f=2% +223 -3 
2. (a) Find the splitting field of $f = x^4 - 2x^3 - 7x^2 + 10x + 10$ over $\mathbb{Q}$.
(b) Find the splitting field of $f = x^4 + x^3 + 2x^2 + x + 1$ over $\mathbb{Q}$.
3. If $2 \neq 0$ in the field $F$, show that the splitting field $E$ of $x^4 + 1$ over $F$ is a simple
extension of $F$ and factor $x^4 + 1$ completely in $E[x]$. What happens if $2 = 0$ in $F$?
4. In each case, find the splitting field $E$ of $f$ over $F$ and factor $f$ completely in $E[x]$.


(a) f=22°4+1,F=Z, (b) f=22+1, F=Z3 
(c) f=ae®+e?41,F=Z, (d) f=23-24+1,F=Z, 
(ce) f=at-a?-2,F=Zs (f) fHat+e3+e4+1, F=Zp 


5. Show that $x^2 - 3$ and $x^2 - 2x - 2$ have the same splitting field.
6. (a) Is $\mathbb{C}$ the splitting field of some polynomial over $\mathbb{Q}$? Support your answer.
(b) If $f \in \mathbb{R}[x]$ is nonconstant, show that $\mathbb{R}$ or $\mathbb{C}$ is a splitting field of $f$ over $\mathbb{R}$.
7. Let $f = gh$ in $F[x]$ where $g$ and $h$ are nonconstant. If $E$ is a splitting field of $f$ over
$F$, show that $g$ splits in $E[x]$.
8. Let $E \supseteq F$ be a splitting field of $f$ over $F$. If $[E : F]$ is prime, show that $E = F(u)$
for some $u$ in $E$ (that is, $E$ is a simple extension of $F$).
9. Let $f$ and $g$ be polynomials in $F[x]$. Show that $f$ and $g$ are relatively prime in $F[x]$
if and only if they have no common root in any extension $E \supseteq F$.
10. If $f \in F[x]$, show that $E \supseteq F$ is a splitting field for $f$ if and only if $f$ splits over $E$
and not over any proper subfield of $E$ containing $F$.
11. Let $E \supseteq L \supseteq F$ be fields and let $f \in F[x]$. If $E$ is a splitting field for $f$ over $F$, show
that $E$ is also a splitting field of $f$ over $L$.
12. If $f, g \in F[x]$, show that any splitting field of $fg$ contains splitting fields of $f$ and $g$.
13. Let $\omega = e^{2\pi i/p}$ be a $p$th root of unity, where $p$ is a prime. Show that $\mathbb{Q}(\omega)$ is the
splitting field of $x^p - 1$ over $\mathbb{Q}$ and that $[\mathbb{Q}(\omega) : \mathbb{Q}] = p - 1$. [Hint: Example 13 §4.2.]
14. Let $f \in F[x]$ and let $g = f(ax + b)$, $a \neq 0$, $b$ in $F$. From Exercise 12, assume that
$K \supseteq F$ is a field containing splitting fields $E$ and $L$ of $f$ and $g$, respectively. Show
that $E = L$.




15. Let $f$ and $g$ be monic and irreducible in $F[x]$ with relatively prime degrees. If $u$ is a
root of $g$ in some extension field $E \supseteq F$, show that $f$ is irreducible over $F(u)$. [Hint:
Use Theorem 1 to find a field $K \supseteq E$ in which $f$ has a root $v$. Apply Exercise 21 §6.2
to show that $f$ is the minimal polynomial of $v$ over $F(u)$.]

16. Show that the isomorphism $F(u) \to \bar{F}(v)$ in Theorem 3 is uniquely determined by
the condition that it extends $\sigma$ and carries $u$ to $v$.

17. If $\sigma : F \to \bar{F}$ is an isomorphism of fields, prove that $f \mapsto f^{\sigma}$ is a ring isomorphism
$F[x] \to \bar{F}[x]$ that extends $\sigma$ (see the discussion preceding Theorem 3).

18. If $E \supseteq F$ is an algebraic extension of fields and every polynomial in $F[x]$ splits over
$E$, show that $E$ is algebraically closed. [Hint: Corollary 1 of Theorem 6 §6.2.]

19. Show that $\pi$ is not algebraic over the field $A$ of algebraic numbers.

20. (a) Find the algebraic closure $A$ of $\mathbb{Q}$ in $E = \mathbb{Q}(i, \pi)$. [Hint: Exercises 19 and 31 §6.2.]
(b) Is $A$ algebraically closed? Support your answer.

21. Show that the following conditions are equivalent for fields $E \supseteq F$:
(1) $E$ is the splitting field of a polynomial in $F[x]$.
(2) $[E : F]$ is finite and every irreducible polynomial in $F[x]$ with a root in $E$ splits
completely in $E[x]$.

Algebraic extensions with the second property in (2) are called normal extensions.

[Hint: For (1) $\Rightarrow$ (2), let $p$ in $F[x]$ be irreducible with a root $u$ in $E$ and let $v$ be a
root of $p$ in a field $K \supseteq E$. Find an isomorphism $\sigma : F(u) \to F(v)$ and apply Theorem 3
to conclude that $E \cong E(v)$. Then argue that $E = E(v)$, so $v \in E$. For (2) $\Rightarrow$ (1), use
Theorem 6 §6.2.]


6.4 FINITE FIELDS 


The theory of finite fields is satisfying because they can be completely classified. 
Galois introduced this subject in his investigation of the insolvability of polynomial 
equations. Apart from its intrinsic interest, the subject has applications in group 
theory, combinatorics, and coding theory, among other areas. Of course, when we 
speak of a finite field F, we mean that its order |F| is finite. 

If $F$ is a finite field, our first observation is that $F$ has characteristic $p$ for some
prime $p$. Therefore, $F$ contains a copy of the field $\mathbb{Z}_p$ of integers modulo $p$. It is
customary (and we shall do so) to identify $\mathbb{Z}_p$ with the prime subfield of $F$, that
is, $\mathbb{Z}_p \subseteq F$. In particular, this means that $F$ is a vector space over $\mathbb{Z}_p$ and so has
a basis $\{u_1, \ldots, u_n\}$. Thus, the elements of $F$ are uniquely represented in the form
$a_1 u_1 + \cdots + a_n u_n$, $a_i \in \mathbb{Z}_p$, by Theorem 3 §6.1. There are $p$ independent choices for
each coefficient $a_i$, so we have


Theorem 1. If $F$ is a finite field, then $|F| = p^n$ for some $n \geq 1$, where $p$ = char $F$.


Theorem 1 leads inevitably to two questions: Is there a field of order $p^n$ for
each prime $p$ and integer $n \geq 1$? If so, is it unique? The answer to both questions
is yes (Theorem 4). One method of constructing a field of $p^n$ elements is already
available: If $f$ is an irreducible polynomial of degree $n$ in $\mathbb{Z}_p[x]$, then the factor ring
$\mathbb{Z}_p[x]/(f)$ is a field of order $p^n$ by Theorem 3 §4.3. The problem is that we have no
guarantee that such a polynomial $f$ exists.




To motivate the procedure we use, suppose for the moment that a field $F$ exists
with $|F| = p^n$ elements. Then $F^*$ is a group with $p^n - 1$ elements so, by Lagrange’s
theorem, $a^{p^n - 1} = 1$ for all $a \neq 0$ in $F$. Hence, $a^{p^n} = a$, so as this also holds if $a = 0$,
every element of $F$ is a root of the polynomial $x^{p^n} - x$. Hence, the approach we take
is to show that the splitting field of $x^{p^n} - x$ over $\mathbb{Z}_p$ is a field of order $p^n$. This
method has the added virtue that the uniqueness of the field then comes from the
uniqueness theorem for splitting fields (Theorem 4 §6.3). Moreover, we can then
prove the existence of an irreducible polynomial of each degree over $\mathbb{Z}_p$.

The construction of the splitting field of $x^{p^n} - x$ requires two preliminary
observations. The first is related to the binomial theorem: If $F$ is any field of
characteristic $p$, then


$(a + b)^p = a^p + b^p$ for all $a, b \in F$


by Theorem 2 §3.4.⁷³ Thus, the mapping $\sigma : F \to F$ given by $\sigma(a) = a^p$ is a ring
homomorphism. It is one-to-one because $F$ is a field, and so is onto when $F$ is
finite. Thus, for a finite field $F$, $\sigma$ is an automorphism, called the Frobenius
automorphism of $F$.
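
The identity $(a + b)^p = a^p + b^p$ rests on the fact that $p$ divides every binomial coefficient $\binom{p}{k}$ with $0 < k < p$. The sketch below is an illustration only, with hypothetical function names; it checks both statements numerically in $\mathbb{Z}_p$ and shows that they can fail when the modulus is not prime.

```python
from math import comb


def frobenius_is_additive(p):
    """Check (a + b)^p == a^p + b^p modulo p for all residues a, b."""
    return all(
        pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
        for a in range(p)
        for b in range(p)
    )


def middle_binomials_vanish(p):
    """Check that C(p, k) is divisible by p for every 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))


for modulus in (2, 3, 5, 7, 11, 4):
    print(modulus, frobenius_is_additive(modulus), middle_binomials_vanish(modulus))
```

Both checks succeed for the primes listed and fail for the composite modulus 4, where for example $(1 + 1)^4 = 16 \equiv 0$ but $1^4 + 1^4 = 2$.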

The second result we need to compute the splitting field of $x^{p^n} - x$ is a condition
guaranteeing that a polynomial in $F[x]$ has distinct roots in any splitting field. This
requires a purely algebraic version of the derivative of a polynomial.

Let $f = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ be a polynomial in $F[x]$. The derivative
of $f$ is the polynomial $f'$ in $F[x]$ defined by

$f' = a_1 + 2a_2 x + \cdots + n a_n x^{n-1}.$

In particular, if $f = ax^k$, where $k > 0$, then $f' = kax^{k-1}$. This relation holds even if
$k = 0$, because the derivative of a constant polynomial is 0. Note that this definition
of the derivative does not involve limits as in calculus. Nonetheless, the usual rules
of differentiation hold.


Theorem 2. Let $f$ and $g$ be polynomials in $F[x]$, where $F$ is a field.

(1) $(af)' = af'$ for all $a \in F$.
(2) $(f + g)' = f' + g'$.
(3) $(fg)' = fg' + f'g$.
(4) $\{f[g(x)]\}' = f'[g(x)]\, g'(x)$.
Proof. (1) and (2) follow immediately from the definition. To prove (3), write
$f = a_0 + a_1 x + a_2 x^2 + \cdots$. If $y$ is an indeterminate over $F[x]$, compute

$$f(x) - f(y) = a_1(x - y) + a_2(x^2 - y^2) + a_3(x^3 - y^3) + \cdots$$
$$= (x - y)[a_1 + a_2(x + y) + a_3(x^2 + xy + y^2) + \cdots]$$
$$= (x - y) f_0(x, y),$$

where $f_0(x, y)$ is a polynomial uniquely determined by $f$. Our interest in this is the
observation that $f_0(x, x) = f'$. Writing $p = fg$, compute in the field of quotients of
$F[x, y]$:

$$p_0(x, y) = \frac{p(x) - p(y)}{x - y} = f(x)\,\frac{g(x) - g(y)}{x - y} + \frac{f(x) - f(y)}{x - y}\,g(y)$$
$$= f(x)\, g_0(x, y) + f_0(x, y)\, g(y).$$

Now (3) follows by taking $y = x$, and a similar argument gives (4) (Exercise 16). ∎


⁷³ This formula is sometimes called the “freshman’s dream.”




Thus, we can compute derivatives of polynomials over any field just as we do over 
R in calculus. 

If $f \in F[x]$, $F$ a field, an element $a$ of $F$ is called a repeated root of $f$ if
$f = (x - a)^2 g$ for some $g \in F[x]$. Here is a simple test for the existence of repeated
roots.


Theorem 3. Let $f$ be a polynomial in $F[x]$, $F$ a field, and let $a \in F$. Then:
(1) $(x - a)^2$ divides $f$ if and only if $(x - a)$ divides both $f$ and $f'$.
(2) If $f(a) = 0$ then $(x - a)^2$ divides $f$ if and only if $f'(a) = 0$.


Proof. If $f = (x - a)^2 g$, then $f' = (x - a)[(x - a)g' + 2g]$ by Theorem 2. Conversely,
if $f = (x - a)h$, then $f' = (x - a)h' + h$. Thus, $(x - a)$ divides $h$ because $x - a$ divides
$f'$, so $(x - a)^2$ divides $f$. This proves (1), and (2) is now clear. ∎
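
The test in Theorem 3(2) is easy to run for polynomials over $\mathbb{Z}_p$. The following sketch is illustrative only: it assumes polynomials are stored as coefficient lists with the constant term first, and the function names are hypothetical.

```python
def derivative(f, p):
    """Formal derivative over Z_p; f is a coefficient list, constant term first."""
    return [(k * c) % p for k, c in enumerate(f)][1:]


def evaluate(f, a, p):
    """Evaluate f at a in Z_p using Horner's rule."""
    value = 0
    for c in reversed(f):
        value = (value * a + c) % p
    return value


def repeated_roots(f, p):
    """Elements a of Z_p with f(a) = 0 and f'(a) = 0, so (x - a)^2 divides f."""
    df = derivative(f, p)
    return [a for a in range(p)
            if evaluate(f, a, p) == 0 and evaluate(df, a, p) == 0]


# f = (x - 1)^2 (x + 1) = x^3 - x^2 - x + 1 over Z_5
print(repeated_roots([1, 4, 4, 1], 5))
# x^5 - x over Z_5 has derivative -1, so it has no repeated root
print(derivative([0, 4, 0, 0, 0, 1], 5), repeated_roots([0, 4, 0, 0, 0, 1], 5))
```

The second example is the case $n = 1$, $p = 5$ of the polynomial $x^{p^n} - x$, whose formal derivative is the constant $-1$ in characteristic $p$.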


We can now prove the main theorem of this section. 


Theorem 4. Let $p$ be prime, let $n \geq 1$ be an integer, and write $f = x^{p^n} - x$.
(1) Any field $F$ with $|F| = p^n$ is a splitting field of $f$ over $\mathbb{Z}_p$.
(2) Every splitting field of $f$ over $\mathbb{Z}_p$ has order $p^n$.

Hence, a field of order $p^n$ exists and is unique up to an isomorphism fixing $\mathbb{Z}_p$.


Proof. (1) Assume that $|F| = p^n$. We observed above that every element of $F$ is a
root of $f$. As deg $f = p^n$, $f$ can have at most $p^n$ roots in any field. Thus, the fact
that $|F| = p^n$ implies that $f$ factors into linear polynomials in $F[x]$. Hence, $F$ is a
splitting field for $f$.

(2) Let $K$ be a splitting field of $f$ over $\mathbb{Z}_p$ and let $K_0 = \{a \in K \mid f(a) = 0\}$
denote the set of roots of $f$ in $K$. We have $f' = p^n x^{p^n - 1} - 1 = -1 \neq 0$ because $K$ has
characteristic $p$, so $f$ has distinct roots in $K$ by Theorem 3. Because $f$ splits in $K$
and deg $f = p^n$, this implies that $|K_0| = p^n$. Hence, it suffices to show that $K_0$ is a
subfield of $K$ (then $K_0 = K$ because $K$ is generated by the roots of $f$). To this end,
let $\sigma : K \to K$ be the Frobenius automorphism given by $\sigma(a) = a^p$. Then
$\sigma^2(a) = \sigma(a^p) = \sigma(a)^p = a^{p^2}$ and an easy induction gives $\sigma^n(a) = a^{p^n}$. This
means that $K_0 = \{a \in K \mid \sigma^n(a) = a\}$. Because $\sigma^n$ is an automorphism of $K$, it
follows that $K_0$ is a subfield of $K$, as required.

Finally, the existence of a field of order $p^n$ follows from (2) and Theorem 2 §6.3.
The uniqueness is by Theorem 4 §6.3. ∎


If $p$ is a prime and $n \geq 1$ is an integer, the unique field with $p^n$ elements is called
the Galois field of order $p^n$ and is denoted $GF(p^n)$.


Example 1. $GF(p) = \mathbb{Z}_p$ for each prime $p$.


We have already constructed the Galois fields $GF(4)$ and $GF(8)$ (Example 7 §4.3
and Example 1 §6.3), by using the fact that $x^2 + x + 1$ and $x^3 + x + 1$ are irreducible
over $\mathbb{Z}_2$. The polynomial $x^2 + 1$ is irreducible over $\mathbb{Z}_p$ for any prime $p$ congruent to
3 modulo 4 (Example 6 §4.3), which yields $GF(p^2)$ in this case. However, finding
an irreducible polynomial of degree $n$ over $\mathbb{Z}_p$ is not easy (although we will show
in Corollary 2, Theorem 7, that one must exist).


Example 2. Show that $x^4 + x + 1$ is irreducible over $\mathbb{Z}_2$ and so construct $GF(16)$.


Solution. It suffices to show that $f = x^4 + x + 1$ is irreducible; then Theorem 3 §4.3
gives $GF(16) = \{a + bt + ct^2 + dt^3 \mid a, b, c, d \in \mathbb{Z}_2;\ t^4 = t + 1\}$. Suppose that $f$ is not
irreducible. Because $f$ has no root in $\mathbb{Z}_2$, it must factor as $f = pq$, where $p$ and $q$ are
quadratics in $\mathbb{Z}_2[x]$. But then $p = q = x^2 + x + 1$ (the other quadratics are $x^2$, $x^2 + 1$,
and $x^2 + x$, and all have a root in $\mathbb{Z}_2$). Hence, $f = pq = (x^2 + x + 1)^2 = x^4 + x^2 + 1$,
a contradiction. □
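
The argument of Example 2 can be mirrored in code. This is a sketch under stated assumptions, not the text's method: polynomials over $\mathbb{Z}_2$ are encoded as bitmasks (bit $i$ is the coefficient of $x^i$), irreducibility is tested by brute-force trial division, and the function names are hypothetical.

```python
def poly_mod(f, g):
    """Remainder of f on division by g over Z_2 (bitmask encoding, g nonzero)."""
    dg = g.bit_length() - 1
    while f.bit_length() > dg:                    # while deg f >= deg g
        f ^= g << (f.bit_length() - 1 - dg)
    return f


def is_irreducible_gf2(f):
    """Trial division by every polynomial of degree 1 up to deg(f) // 2."""
    deg = f.bit_length() - 1
    return all(poly_mod(f, g) != 0 for g in range(2, 1 << (deg // 2 + 1)))


def gf16_mul(a, b, modulus=0b10011):
    """Multiply in Z_2[t]/(t^4 + t + 1); elements are 4-bit integers."""
    prod = 0
    for i in range(4):
        if (b >> i) & 1:
            prod ^= a << i
    return poly_mod(prod, modulus)


def gf16_pow(a, n):
    """Compute a^n in GF(16) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf16_mul(result, a)
    return result


print(is_irreducible_gf2(0b10011))   # x^4 + x + 1
print(is_irreducible_gf2(0b10101))   # x^4 + x^2 + 1 = (x^2 + x + 1)^2
print(all(gf16_pow(a, 16) == a for a in range(16)))
```

The last line checks that every element of the constructed field is a root of $x^{16} - x$, as Theorem 4 requires.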


If G is a cyclic group of order n, G has a subgroup of order m if and only if 
m|n and, in this case, there is exactly one subgroup of order m (Theorem 9 §2.4). 
However, the problem of describing all subgroups of an arbitrary finite group G is 
very difficult (although much can be said if G is abelian). In the case of finite fields, 
however, we can describe the subfields of $GF(p^n)$ explicitly.


Theorem 5. Let $p$ be prime and let $n \geq 1$ be an integer.
(1) If $K$ is a subfield of $GF(p^n)$, then $K = GF(p^m)$ for some $m$ with $m \mid n$.

(2) If $m \mid n$, there is exactly one subfield of $GF(p^n)$ of order $p^m$, and it consists
of the roots of $x^{p^m} - x$ in $GF(p^n)$.


Proof. (1) Write $E = GF(p^n)$. Given a subfield $K \subseteq E$, then char $K = p$, so
$\mathbb{Z}_p \subseteq K$. Also, $K = GF(p^m)$ for some $m \leq n$ by Theorem 1. In fact, $m \mid n$ by the
multiplication theorem: $n = [E : \mathbb{Z}_p] = [E : K][K : \mathbb{Z}_p] = [E : K]m$.

(2) Observe that $x^{ab} - 1 = (x^a - 1)(x^{ab-a} + x^{ab-2a} + \cdots + x^a + 1)$. Consequently,
if $n = mk$, we have $p^n - 1 = (p^m - 1)q$ for some $q \in \mathbb{Z}$. Hence,

$x^{p^n} - x = x(x^{p^n - 1} - 1) = x(x^{p^m - 1} - 1)g = (x^{p^m} - x)g,$

where $g \in E[x]$. Hence, $x^{p^m} - x$ splits in $E[x]$ since $x^{p^n} - x$ does. If we define
$E_0 = \{u \in E \mid u \text{ is a root of } x^{p^m} - x\}$, then $|E_0| = p^m$ because the roots of
$x^{p^m} - x$ are distinct. Moreover, $E_0$ is a field as in the proof of Theorem 4.

Now let $K \subseteq E$ be any subfield with $|K| = p^m$. Then $K \subseteq E_0$ by Theorem 4.
But then $K = E_0$ because $|K| = |E_0|$, proving (2). ∎

Example 3. Draw the lattice diagram of the subfields of $GF(p^6)$.

Solution. By Theorem 5, the subfields are $GF(p) = \mathbb{Z}_p$, $GF(p^2)$, $GF(p^3)$, and
$GF(p^6)$. In the lattice diagram, $GF(p^6)$ is at the top, $GF(p^2)$ and $GF(p^3)$ lie side
by side in the middle (neither contains the other), and $GF(p)$ is at the bottom. □
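
By Theorem 5, the subfield lattice of $GF(p^n)$ has the same shape as the lattice of divisors of $n$. A small sketch (the function name is hypothetical) that lists the divisors $m$ of $n$ and the pairs $(a, b)$ for which $GF(p^a)$ sits directly below $GF(p^b)$:

```python
def subfield_lattice(n):
    """Return the divisors of n and the covering pairs (a, b) of the divisor lattice.

    Each divisor m stands for the unique subfield GF(p^m) of GF(p^n); the pair
    (a, b) means GF(p^a) is contained in GF(p^b) with no subfield strictly between.
    """
    divisors = [m for m in range(1, n + 1) if n % m == 0]
    covers = [
        (a, b)
        for a in divisors
        for b in divisors
        if a < b and b % a == 0
        and not any(a < c < b and c % a == 0 and b % c == 0 for c in divisors)
    ]
    return divisors, covers


print(subfield_lattice(6))
print(subfield_lattice(12))
```

For $n = 6$ the divisors are 1, 2, 3, 6, and the covering pairs describe a diamond: $GF(p)$ lies below both $GF(p^2)$ and $GF(p^3)$, which both lie below $GF(p^6)$.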


If $f \in \mathbb{Z}[x]$ is monic, the modular irreducibility test (Theorem 7 §4.2) asserts
that if $f$ is irreducible in $\mathbb{Z}_p[x]$ for some prime $p$, then $f$ is irreducible in $\mathbb{Q}[x]$; that
is, it has no proper factorization in $\mathbb{Z}[x]$. But the converse is false by Theorem 5:


Example 4. Show that $x^4 + 1$ is irreducible in $\mathbb{Q}[x]$, but is reducible in $\mathbb{Z}_p[x]$ for
every prime $p$.


Solution. By Theorem 9 §4.1, $x^4 + 1$ has no root in $\mathbb{Q}$. If
$x^4 + 1 = (x^2 + ax + b)(x^2 + cx + d)$ in $\mathbb{Z}[x]$, then $a + c = 0$, $b + ac + d = 0$ and
$bd = 1$, so $a^2 = \pm 2$, a contradiction. So $x^4 + 1$ is irreducible in $\mathbb{Q}[x]$.

Write $E = GF(p^2)$ for convenience, and regard $\mathbb{Z}_p \subseteq E$. Suppose that $x^4 + 1$
is irreducible over $\mathbb{Z}_p$. Then $p$ is odd (otherwise $x^4 + 1 = (x^2 + 1)^2$), so $8 \mid (p^2 - 1)$.
Hence $x^8 - 1$ divides $x^{p^2 - 1} - 1$ in $\mathbb{Z}[x]$ so, since $x^8 - 1 = (x^4 + 1)(x^4 - 1)$, we have
$x^{p^2 - 1} - 1 = (x^4 + 1)q$ for some $q \in \mathbb{Z}[x]$. But then $(x^4 + 1)q$ has $p^2 - 1$ distinct
roots in $E$ by Theorem 5, and it follows that $x^4 + 1$ has a root $u \in E \setminus \mathbb{Z}_p$. Hence
$[\mathbb{Z}_p(u) : \mathbb{Z}_p] = 4 \neq 2$, so $E \neq \mathbb{Z}_p(u)$. Thus $E \supset \mathbb{Z}_p(u) \supset \mathbb{Z}_p$ with both
inclusions proper, a contradiction because the only subfields of $E$ are $\mathbb{Z}_p$ and $E$
(by Theorem 5). So $x^4 + 1$ is reducible over $\mathbb{Z}_p$ after all. □
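
The conclusion of Example 4 can be observed directly for small primes. The sketch below is a brute-force illustration, not the argument of the text: since the coefficient of $x^3$ forces $c = -a$, it searches for $a, b, d \in \mathbb{Z}_p$ with $x^4 + 1 = (x^2 + ax + b)(x^2 - ax + d)$. The function name is hypothetical.

```python
def quadratic_factorization(p):
    """Find (a, b, d) with x^4 + 1 = (x^2 + a x + b)(x^2 - a x + d) mod p, else None.

    Expanding the product gives x^4 + (b + d - a^2) x^2 + a (d - b) x + b d,
    so the three lower coefficients must be 0, 0 and 1 modulo p.
    """
    for a in range(p):
        for b in range(p):
            for d in range(p):
                if ((b + d - a * a) % p == 0
                        and (a * (d - b)) % p == 0
                        and (b * d) % p == 1):
                    return a, b, d
    return None


for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
    print(p, quadratic_factorization(p))
```

A factorization is found for every prime in the list, while over $\mathbb{Z}$ the same equations would force $a^2 = \pm 2$.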


Note that the second half of Example 4 also follows from the corollary to Fermat’s 
theorem (Theorem 8 §1.3). 
Before proceeding, recall the following results about a group G: 


Lemma 1. Let $G$ be a group and let $x, y \in G$. Then:
(1) If $o(x) = pq$, then $o(x^p) = q$.
(2) If $o(x) = m$, $o(y) = n$, $\gcd(m, n) = 1$ and $xy = yx$, then $o(xy) = mn$.


Proof. (1) is Theorem 5 §2.4. Given $x$ and $y$ as in (2), it is clear that $(xy)^{mn} = 1$.
If $(xy)^d = 1$ then $1 = (xy)^{dm} = y^{dm}$. Hence $n \mid dm$, so $n \mid d$ because $\gcd(m, n) = 1$.
Similarly $m \mid d$, so $mn \mid d$, again because $\gcd(m, n) = 1$. This proves (2). ∎


Lemma 2. If $G$ is a finite abelian group, let $a$ be an element of $G$ of maximal order.
If $o(a) = m$ then $g^m = 1$ for every $g \in G$.


Proof. Suppose $g^m \neq 1$ for some $g \in G$, and write $o(g) = n$. Then $n$ does not divide
$m$ so there exists a prime power $p^e$, $e \geq 1$, such that $p^e \mid n$ but $p^e$ does not divide
$m$ ($n$ is a product of prime powers with distinct primes). If we write $n = p^e q$ then
$o(g^q) = p^e$ by Lemma 1(1).

On the other hand, write $m = p^t k$ where $t \geq 0$ and $p$ does not divide $k$. Then
$o(a^{p^t}) = k$, again by Lemma 1(1). But $k$ and $p^e$ are relatively prime so Lemma
1(2) gives $o(a^{p^t} g^q) = kp^e > kp^t = m$ because $p^e > p^t$ ($p^e$ does not divide $m$). This
contradicts the maximality of $m$ and so proves Lemma 2. ∎


We can now prove that the group of units $F^*$ of a finite field $F$ is a cyclic group,
a result due to Galois. In fact, we get a stronger result with the same effort.


Theorem 7. Let $F$ be any field. If $G$ is a finite subgroup of the multiplicative
group $F^*$ of $F$, then $G$ is cyclic. In particular, if $F$ is finite, then $F^*$ is cyclic.


Proof. Let $a \in G$ have maximal order $m$. Then $g^m = 1$ for all $g \in G$ by Lemma 2.
Thus, every element of $G$ is a root of $x^m - 1$, so $|G| \leq m$ by Theorem 8 §4.1. But
then $m = |\langle a \rangle| \leq |G| \leq m$, so $G = \langle a \rangle$ is cyclic, as required. ∎


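Note that Theorem 7 fails if $G$ is infinite: consider the unit circle group $\mathbb{C}^0$ of
complex numbers of absolute value 1. It fails even if $G$ is torsion: the subgroup of
$\mathbb{C}^0$ consisting of all roots of unity is infinite and not cyclic.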

If $F$ is a finite field, a generator for $F^*$ is called a primitive element for
$F$. Hence, Theorem 7 asserts that every finite field has a primitive element. In
particular, $\mathbb{Z}_p$ has a primitive element for each prime $p$, called a primitive root
modulo $p$. This fact is important in number theory. More generally, if a (possibly
infinite) field $F$ has a multiplicative subgroup $G$ of order $n$, a generator of $G$ (which
exists by Theorem 7) is called a primitive $n$th root of unity in $F$. For example,
$e^{2\pi i/n}$ is a primitive $n$th root of unity in $\mathbb{C}$ for each $n \geq 2$.
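
For small primes, primitive roots modulo $p$ can be found by computing orders directly. The following is a naive sketch for illustration (the function names are hypothetical, and no attempt is made at efficiency):

```python
def multiplicative_order(a, p):
    """Order of a in the group of units of Z_p (a must be nonzero modulo p)."""
    order, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        order += 1
    return order


def primitive_roots(p):
    """All generators of the cyclic group Z_p^* for a prime p."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]


print(primitive_roots(7))
print(primitive_roots(11))
```

Theorem 7 guarantees that the returned list is never empty when $p$ is prime.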

If $F$ is a finite field of characteristic $p$, the existence of a primitive element $u$ in
$F$ implies that $F = \mathbb{Z}_p(u)$; in other words, $F$ is a simple extension of $\mathbb{Z}_p$. We record
this fact for future reference.


Corollary 1. $GF(p^n) = \mathbb{Z}_p(u)$, where $u$ is any primitive element for $GF(p^n)$.




Corollary 2. If $p$ is a prime and $n \geq 1$ is an integer, there exists an irreducible
polynomial of degree $n$ over $\mathbb{Z}_p$.


Proof. Write $F = GF(p^n)$ and let $F^* = \langle u \rangle$ by Theorem 7. Here, $F \supseteq \mathbb{Z}_p$, as usual,
so let $m$ be the minimal polynomial of $u$ over $\mathbb{Z}_p$. Then $m$ is irreducible and, as
$F = \mathbb{Z}_p(u)$, deg $m = [\mathbb{Z}_p(u) : \mathbb{Z}_p] = [F : \mathbb{Z}_p] = n$. ∎


Theorem 7 casts new light on the description in Theorem 5 of every subfield
$K$ of $F = GF(p^n)$. First, $K$ is uniquely determined by its order because $K^*$ is a
subgroup of the cyclic group $F^*$ (and so is unique of its order). Now let $u$ be a
primitive element for $F$ so that $F^* = \langle u \rangle$, where $o(u) = p^n - 1$. Because $K^*$ is a
subgroup of $F^*$, it has the form $K^* = \langle u^d \rangle$, where $d$ divides $p^n - 1$. Hence, $u^d$ is
a primitive element for $K$. Moreover, as $|K| = p^m$, where $m \mid n$ (by Theorem 5), we
have $p^m - 1 = |K^*| = |F^*|/d$. Hence,


$d = \dfrac{p^n - 1}{p^m - 1}$ (where $m \mid n$) and $K = \{0\} \cup \langle u^d \rangle$.


This gives a complete description of the subfields $K$ of $F$ in terms of the divisors
$m$ of $n$ and a primitive element $u$ for $F$.
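
As an illustration of this description, take $F = GF(16) = \mathbb{Z}_2[t]/(t^4 + t + 1)$. The sketch below assumes that $t$ is a primitive element of this field (its order is 15) and encodes elements as 4-bit integers; the function names are hypothetical.

```python
def gf16_mul(a, b):
    """Multiply in GF(16) = Z_2[t]/(t^4 + t + 1); elements are 4-bit integers."""
    prod = 0
    for i in range(4):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in (6, 5, 4):                    # reduce with t^4 = t + 1
        if (prod >> deg) & 1:
            prod ^= 0b10011 << (deg - 4)
    return prod


def subfield(m, u=0b0010):
    """Elements of the subfield of order 2^m in GF(16), as {0} together with <u^d>.

    Assumes m divides 4 and that u is a primitive element of GF(16).
    """
    d = 15 // (2**m - 1)                     # d = (p^n - 1) / (p^m - 1)
    g = 1
    for _ in range(d):
        g = gf16_mul(g, u)                   # g = u^d generates K^*
    elements, power = {0}, 1
    for _ in range(2**m - 1):
        elements.add(power)
        power = gf16_mul(power, g)
    return elements


print(sorted(subfield(1)), sorted(subfield(2)), len(subfield(4)))
```

Here $m = 2$ gives $d = 5$, and the four elements $0, 1, t^5 = t^2 + t, t^{10} = t^2 + t + 1$ form the copy of $GF(4)$ inside $GF(16)$.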


Exercises 6.4 


Throughout these exercises, $F$ denotes a field.


1. Find a primitive element for 
(a) Zu (b) Zas (c) GF(8) (d) GF(9) 
2. Construct a field of order 27 and find a primitive element. 
3. Explain why $\mathbb{Z}_2[x]/(p)$ and $\mathbb{Z}_2[x]/(q)$ are isomorphic if $p = x^3 + x^2 + 1$ and
$q = x^3 + x + 1$.
4. If p is a prime, draw the subfield lattice of 
(a) GF(p””) (b) GF(p*°) (c) GF(p*) 


5. Find a primitive element of GF(16) and use it to write down all the subfields. 
6. Find a primitive element of GF'(32) and use it to write down all the subfields. 
7. Let $E \supseteq F$ be fields. If $E$ is finite, show that $E = F(u)$ for some $u \in E$.
8. Find $[GF(p^n) : GF(p^m)]$, where $m \mid n$.
9. Describe all the finite subgroups of $\mathbb{C}^*$.
10. If $G$ and $H$ are subgroups of $F^*$ of order $n$, show that $G = H$.
11. Show that each element $a$ of $F = GF(p^n)$ has a $p$th root in $F$; that is, $a = b^p$ for
some $b \in F$.
12. Let $F$ be a field in which $F^*$ is cyclic. Prove that $F$ is finite.
13. If $E \supseteq \mathbb{Z}_p$ is a field and $u \in E$ is a root of $f \in \mathbb{Z}_p[x]$, show that $u^p$ is also a root.
[Hint: Frobenius automorphism.]
14. Let $F$ be a finite field of characteristic $p$. If $u$ is a primitive element for $F$, show that
$u^p$ is also a primitive element.
15. Show that $x^2 + x + 1$ is irreducible over $GF(2^n)$ if $n$ is odd. [Hint: If $u$ is a root,
compute $u^{2^k}$ for each $k \geq 1$.]
16. Prove (4) of Theorem 2.
17. Let $f$ be a nonconstant polynomial in $F[x]$. Show that $f$ has no repeated root in
any splitting field over $F$ if and only if $f$ and $f'$ are relatively prime in $F[x]$.




18. (a) Show that a monic irreducible polynomial $f$ in $F[x]$ has no repeated root in any
splitting field over $F$ if and only if $f' \neq 0$ in $F[x]$.

(b) If char $F = 0$, show that no irreducible polynomial has a repeated root in any
splitting field over $F$.

19. If char $F = p$, show that a monic irreducible polynomial $f$ in $F[x]$ has a repeated
root in some splitting field if and only if $f = g(x^p)$ for some $g \in F[x]$. [Hint: Exercise
18.]

20. Show that no finite field F is algebraically closed. [Hint: Apply Exercise 17 to 
f =a2%! 4.1, where g = |FI.] 

21. Let $p$ be a prime and write $f = x^p - x - 1$. Show that the splitting field of $f$ over
$\mathbb{Z}_p$ is $\mathbb{Z}_p(u)$, where $u$ is any root of $f$. [Hint: Compute $f(u + a)$, $a \in \mathbb{Z}_p$.]

22. (a) Let $f$ be a monic irreducible polynomial of degree $n$ in $\mathbb{Z}_p[x]$. Show that $f$
divides $x^{p^n} - x$ in $\mathbb{Z}_p[x]$. [Hint: First work over $\mathbb{Z}_p(u)$, $f(u) = 0$. Use the uniqueness in
Theorem 4 §4.1.]

(b) Show that the degree of each monic irreducible divisor $f$ of $x^{p^n} - x$ is a divisor
of $n$. [Hint: Theorem 5.]
(c) Factor 2° — x into irreducibles in Zo[z}. 

23. If $F$ is a finite field, show that every element of $F$ is the sum of two squares. [Hint:
Given $a \in F$, show that $X = \{u^2 \mid u \in F\}$ and $Y = \{a - u^2 \mid u \in F\}$ each have more
than $\frac{1}{2}|F|$ elements.]


6.5 GEOMETRIC CONSTRUCTIONS 


Geometry is the only science it hath pleased God to bestow on mankind. 


—Thomas Hobbes 


The ancient Greeks were good at geometry. However, unlike the analytic geometry
of today, which makes extensive use of coordinate systems, the Greeks preferred
synthetic methods, such as dropping perpendiculars from a point to a line, intersecting
lines and curves, and the like. In particular, they were interested in constructions
using only compass and straightedge (with no marks on the straightedge). Thus,
they allowed drawing lines through two given points, drawing circles with a given
center and radius and finding points of intersection of these curves.

For example, the usual method of bisecting an angle uses only these methods. 
It may come as a surprise that the ancient Greeks were not able to answer the 
following questions. 


(1) Can any angle be trisected using only compass and straightedge? 


(2) Can any cube be duplicated using only compass and straightedge? That is, 
can a cube be constructed whose volume is twice that of a given cube? 


These questions remained unanswered until the nineteenth century, when algebraic 
methods were applied. The answer to both questions is no, as we demonstrate in 
this section. It is worth noting that, well into the twentieth century, hundreds of 
people claimed to have solved one of these problems, and some have even gone so 
far as to publish their “solutions.” 

To systematically analyze these questions, the idea of a constructible real
number is essential. Suppose that a line segment of finite length is defined to be one
unit in length. Then a real number $a$ is called constructible if a line segment of
length $|a|$ can be constructed from the unit segment in a finite number of steps using
only a compass and straightedge. Note the immediate implication that a number
$a$ is constructible if and only if $-a$ is constructible. In fact, we are going to prove
that these constructible numbers form a subfield of $\mathbb{R}$. The essence of the proof is
the following lemma.


Lemma 1. If $a \geq 0$ and $b \geq 0$ are constructible, then so are $a + b$, $a - b$ (if $a > b$),
$ab$, $b/a$ (if $a \neq 0$), and $\sqrt{a}$.


Proof. With the compass, a copy of any finite line segment can be constructed on
any given line with any given point as either endpoint. Placing segments end to
end shows that $a + b$ is constructible. Similarly, $a - b$ is constructible if $a > b$.


[Figure: three diagrams showing the constructions of $ab$ (left), $b/a$ (middle), and $\sqrt{a}$ (right).]


The diagram on the left-hand side shows the construction for $ab$ (where $a > 1$).
Let two nonparallel lines intersect at $O$ and let $OI$, $OA$, and $OB$ be segments as
shown, with lengths $|OI| = 1$, $|OA| = a$, and $|OB| = b$. Using only compass and
straightedge, construct the line through $A$ parallel to $IB$ and let $C$ be the point of
intersection of this line and the line through $O$ and $B$. Then the similarity of the
triangles $OIB$ and $OAC$ gives

$\dfrac{|OB|}{|OI|} = \dfrac{|OC|}{|OA|}$, that is, $\dfrac{b}{1} = \dfrac{|OC|}{a}$.

Hence, $ab = |OC|$ is constructible. The same argument works if $a < 1$.

The proof that $b/a$ is constructible involves the same setup (middle diagram),
except that now the line through $I$ parallel to $AB$ is constructed. Because $a \neq 0$,
this line meets the line through $O$ and $B$ at $D$, say. Then the similarity of $OID$
and $OAB$ gives $|OD| = b/a$, so $b/a$ is constructible.

Finally, to show that $\sqrt{a}$ is constructible, consider a semicircle with diameter
$PQ$ of length $a + 1$ (right-hand side diagram) and let $R$ be the point of $PQ$ such that
$|PR| = a$. Using a compass and straightedge, construct the line through $R$
perpendicular to the diameter and let this line meet the arc of the circle at $S$. It is a
theorem of geometry that the angle $PSQ$ is a right angle, so the triangles $PSR$ and $SQR$
are similar. Hence, $|SR|/|PR| = |RQ|/|SR|$, that is, $|SR|^2 = |PR||RQ| = a \cdot 1 = a$.
Thus, $\sqrt{a} = |SR|$ is constructible. ∎


Although Lemma 1 deals only with nonnegative constructible numbers, the 
reader can now easily supply the proof of Theorem 1. 


Theorem 1. The set of all constructible numbers is a subfield of R. 


Note that every rational number is constructible. 

With Theorem 1 in hand, we attack the Greek construction problems as follows.
We begin by showing that the minimal polynomial over $\mathbb{Q}$ of every constructible
number has degree a power of 2. Then we prove that a given construction is
impossible by showing that it would allow the construction of a number with
minimal polynomial having degree not a power of 2. As we shall see, this latter
step is quite easy for the two Greek questions mentioned earlier, so we turn to the
algebraic condition on the constructible numbers.

Let $C$ denote the field of constructible numbers. If $a \in C$, then $a$ is the distance
between two points in the plane that have been obtained by a finite series of compass
and straightedge constructions, beginning with points with rational coordinates.
Hence, we consider an arbitrary subfield $F$ of $C$ and investigate the nature of the
points obtained by a single compass and straightedge construction, beginning with
points whose coordinates lie in $F$ (called $F$-points for short). The straightedge
provides lines through pairs of $F$-points and the compass provides circles centered
at $F$-points with radius in $F$ (called $F$-lines and $F$-circles, respectively). The only
way to construct new points is as points of intersection of two $F$-lines, of an $F$-line
and an $F$-circle, or of two $F$-circles. We can easily verify (Exercise 2) that the
equations of $F$-lines and $F$-circles have the following form:


$F$-lines: $ax + by = c$, $a$, $b$, and $c$ in $F$,
$F$-circles: $x^2 + y^2 + ax + by = c$, $a$, $b$, and $c$ in $F$.


It follows that if two $F$-lines intersect, the point of intersection is an $F$-point
(Exercise 2). However, finding the intersection points $(x, y)$ of an $F$-line and an
$F$-circle (if they exist) leads to a quadratic equation for $x$ or $y$ with coefficients
in $F$. Hence, by the quadratic formula, $x$ and $y$ lie in an extension $F(\sqrt{a})$ of $F$,
where $a \in F$ and $a \geq 0$ (Exercise 2). Finally, the intersection points (if any) of
two $F$-circles can be obtained as the intersections of one of the circles with an
$F$-line (the one through the points in question). Hence, in all cases, a compass and
straightedge construction beginning with $F$-points leads to $F(\sqrt{a})$-points, where
$a \in F$, $a \geq 0$. Observe that $\sqrt{a}$ is constructible by Lemma 1, so $F(\sqrt{a}) \subseteq C$ by
Theorem 1. Finally, note that


$[F(\sqrt{a}) : F] = 2$ or $1$,


depending on whether $x^2 - a$ is irreducible or not in $F[x]$.
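
To see concretely how a line and a circle lead to a quadratic with coefficients in $F$, take $F = \mathbb{Q}$. The sketch below is an illustration under stated assumptions: the circle coefficients are renamed $d, e, f$ to avoid clashing with those of the line, the line is assumed to have $b \neq 0$ so that $y$ can be eliminated, and the function name is hypothetical.

```python
from fractions import Fraction


def line_circle_quadratic(a, b, c, d, e, f):
    """Quadratic A x^2 + B x + C = 0 satisfied by the x-coordinates where the line
    a x + b y = c meets the circle x^2 + y^2 + d x + e y = f (assumes b != 0).

    Substituting y = (c - a x) / b and clearing b^2 gives the coefficients below.
    Returns (A, B, C, discriminant), all rational when the inputs are rational.
    """
    a, b, c, d, e, f = map(Fraction, (a, b, c, d, e, f))
    A = a * a + b * b
    B = d * b * b - 2 * a * c - e * a * b
    C = c * c + e * b * c - f * b * b
    return A, B, C, B * B - 4 * A * C


# The line y = x (written x - y = 0) and the unit circle x^2 + y^2 = 1
print(line_circle_quadratic(1, -1, 0, 0, 0, 1))
```

In this example the quadratic is $2x^2 - 1 = 0$ with discriminant 8, so the intersection points have coordinates in $\mathbb{Q}(\sqrt{8}) = \mathbb{Q}(\sqrt{2})$.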

Now suppose that $a$ is any constructible number. Then $|a|$ can be constructed
as the distance between two $C$-points $P$ and $Q$, where $C$ is the field of constructible
numbers and where these points are obtained by a series of compass and
straightedge constructions beginning with $\mathbb{Q}$-points. By the preceding discussion, the first
of these constructions produces $F_1$-points, where $F_1$ is a field, $C \supseteq F_1 \supseteq \mathbb{Q}$, and
$[F_1 : \mathbb{Q}] = 1$ or $2$. Then the second construction yields $F_2$-points, where $F_2$ is a field,
$C \supseteq F_2 \supseteq F_1$, and $[F_2 : F_1] = 1$ or $2$. The process continues to create a chain of
fields $\mathbb{Q} = F_0 \subseteq F_1 \subseteq F_2 \subseteq \cdots \subseteq C$. Suppose that $m - 1$ constructions are needed
to obtain $P$ and $Q$, so that $P$ and $Q$ are $F_{m-1}$-points. Because $|a|$ is the distance
between $P$ and $Q$, the distance formula shows that $a \in F_{m-1}(\sqrt{b})$, where $b \in F_{m-1}$
and $b \ge 0$. Writing $F_m = F_{m-1}(\sqrt{b})$, this means that $[F_m : F_{m-1}] = 1$ or $2$. Hence,
we have constructed a finite chain


$$\mathbb{Q} = F_0 \subseteq F_1 \subseteq F_2 \subseteq \cdots \subseteq F_{m-1} \subseteq F_m$$


of fields where $[F_k : F_{k-1}] = 1$ or $2$ for each $k$ and where $a \in F_m$. Now the
multiplication theorem (Theorem 5 §6.2) gives


$$[F_m : \mathbb{Q}] = [F_m : F_{m-1}] \cdots [F_2 : F_1][F_1 : F_0],$$




so, as each $[F_k : F_{k-1}] = 1$ or $2$, $[F_m : \mathbb{Q}]$ is a power of 2. But $a \in F_m$ implies
that $\mathbb{Q}(a) \subseteq F_m$, so the multiplication theorem implies that $[\mathbb{Q}(a) : \mathbb{Q}]$ is a power
of 2 (being a divisor of $[F_m : \mathbb{Q}]$). Because $[\mathbb{Q}(a) : \mathbb{Q}]$ is the degree of the minimal
polynomial of $a$ over $\mathbb{Q}$ (Theorem 4 §6.2), this condition proves


Theorem 2. If $a$ is a constructible number, then $[\mathbb{Q}(a) : \mathbb{Q}] = 2^k$ for some $k \ge 0$.
In particular, the minimal polynomial of $a$ over $\mathbb{Q}$ has degree $2^k$.


Theorem 2 implies that every constructible number is algebraic over $\mathbb{Q}$. Moreover,
the argument leading to Theorem 2 actually provides a characterization of the
constructible numbers: A real number $a$ is constructible if and only if a chain
$\mathbb{Q} = F_0 \subseteq F_1 \subseteq F_2 \subseteq \cdots \subseteq F_m$ of subfields of $\mathbb{R}$ exists such that $a \in F_m$ and
$[F_k : F_{k-1}] = 1$ or $2$ for each $k$ (Exercise 8).

Theorem 2 provides the means to easily settle the classical construction 
questions posed at the beginning of this section. 


Corollary 1. It is impossible to duplicate a cube of side 1 using only a compass 
and straightedge. 


Proof. If it were possible, then a cube of volume 2 could be constructed. A side of
this cube has length $\sqrt[3]{2}$, which would mean that $\sqrt[3]{2}$ is constructible. But $x^3 - 2$
is irreducible over $\mathbb{Q}$ (by the Eisenstein Criterion, Theorem 8 §4.2), so it is the
minimal polynomial of $\sqrt[3]{2}$. Because the degree 3 is not a power of 2, Theorem 2
shows that $\sqrt[3]{2}$ cannot be constructed. ∎

Corollary 2. It is impossible to trisect $\pi/3$ by compass and straightedge.⁷⁴

Proof. Write $a = \cos(\pi/9)$. If trisection of $\pi/3$ were possible, the right triangle
in the figure (with hypotenuse of length 1) could be constructed, and hence $a$ would
be a constructible number.

[Figure: right triangle with hypotenuse of length 1 and one side of length $a$.]

We show that this is not so by proving
that the degree of $a$ over $\mathbb{Q}$ is 3. To accomplish this, recall the trigonometric identity
$\cos 3\theta = 4\cos^3\theta - 3\cos\theta$ (see Exercise 13, Appendix A). If we take $\theta = \pi/9$, this
becomes $\tfrac{1}{2} = 4a^3 - 3a$. Hence, $a$ is a root of $m = 8x^3 - 6x - 1$, which is irreducible
over $\mathbb{Q}$ by Theorem 1 §4.2 because it has no root in $\mathbb{Q}$ (by Theorem 9 §4.1). Thus, $\tfrac{1}{8}m$
is the minimal polynomial of $a$ over $\mathbb{Q}$, and so the degree is 3 as asserted. ∎
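
The two irreducibility claims above come down to a cubic with integer coefficients having no rational root. The following is a minimal sketch of that check using the rational root test; the helper names `divisors` and `rational_roots` are illustrative choices, not notation from the text, and the code assumes nonzero constant and leading coefficients.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of an integer polynomial.

    coeffs[i] is the coefficient of x**i. Any rational root p/q in lowest
    terms has p dividing the constant term and q dividing the leading
    coefficient, so only finitely many candidates need to be tried.
    """
    candidates = {
        Fraction(sign * p, q)
        for p in divisors(coeffs[0])
        for q in divisors(coeffs[-1])
        for sign in (1, -1)
    }
    return sorted(
        r for r in candidates
        if sum(c * r**i for i, c in enumerate(coeffs)) == 0
    )


# x^3 - 2 (duplicating the cube) and 8x^3 - 6x - 1 (trisecting pi/3)
print(rational_roots([-2, 0, 0, 1]))
print(rational_roots([-1, -6, 0, 8]))
```

Both calls return an empty list. A cubic with no rational root has no linear factor over $\mathbb{Q}$ and is therefore irreducible, so the degree in each case is 3, which is not a power of 2.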


Another famous problem of the ancient Greeks is whether it is possible using 
only compass and straightedge to square a circle—that is, to construct a square with 
area equal to that of a given circle. This too is ruled out by Theorem 2, together 
with a result of Ferdinand von Lindemann. 


Corollary 3. It is impossible with a compass and straightedge to construct a square 
with area equal to the area of a circle of radius 1. 


Proof. The area of such a square would be $\pi$, so the length $\sqrt{\pi}$ of a side of this square
would be constructible. In particular, $\sqrt{\pi}$ would be algebraic over $\mathbb{Q}$, so $\pi$ would be
algebraic. But $\pi$ is transcendental over $\mathbb{Q}$ by a famous theorem of Lindemann. ∎


74 Archimedes showed that if we are allowed to mark the straightedge, it is possible to trisect any 
angle. 




Exercises 6.5 


1. Give a compass and straightedge construction for each of 
(a) A line parallel to a given line through a given point. 
(b) A line perpendicular to a given line through a given point. 
(These are used in the proof of Lemma 1.) 

2. Let $F$ be a subfield of the field of constructible numbers. Show that
(a) Each $F$-line has equation $ax + by = c$, where $a, b, c \in F$.
(b) Each $F$-circle has equation $x^2 + y^2 + ax + by = c$, where $a, b, c \in F$.
(c) The intersection (if any) of two $F$-lines is an $F$-point.
(d) The intersections (if any) of an $F$-line and an $F$-circle are $F(\sqrt{a})$-points, where
$a \in F$ and $a \ge 0$.

3. Can an angle of $\pi/4 = 45°$ be trisected using only a compass and straightedge?
Support your answer.

4. Can an angle of 40° be constructed? Support your answer.

5. Can a sphere be cubed? That is, can a cube be constructed whose volume equals 
that of a given sphere? Support your answer. 

6. Can a cube be tripled? That is, can a cube be constructed whose volume is three 
times that of a given cube? Support your answer. 

7. (a) Show that $\sin\theta$ is constructible if and only if $\cos\theta$ is constructible.
(b) Show that $\cos 2\theta$ is constructible if and only if $\cos\theta$ is constructible.

8. Show that a real number $a$ is constructible if and only if a finite chain of subfields
$\mathbb{Q} = F_0 \subseteq F_1 \subseteq \cdots \subseteq F_m$ of $\mathbb{R}$ exists with $a \in F_m$ and $[F_k : F_{k-1}] = 1$ or $2$ for each $k$.

9. Show that a regular heptagon (seven-sided polygon with vertices equally spaced on
a circle) is not constructible with a compass and straightedge. [Hint: $64x^7 - 112x^5 +
56x^3 - 7x - 1 = (8x^3 + 4x^2 - 4x - 1)(8x^4 - 4x^3 - 8x^2 + 3x + 1)$.]


6.6 THE FUNDAMENTAL THEOREM OF ALGEBRA 


The fundamental theorem of algebra is the assertion that the field C of complex 
numbers is algebraically closed. This result was first proved by Gauss in his Ph.D. 
dissertation, and many proofs of this result are now known. However, no proof is 
entirely algebraic in nature; that is, each proof involves some analytic property of 
polynomials. The proof we give uses only one nonalgebraic fact: 


If a polynomial $f$ in $\mathbb{R}[x]$ has odd degree, then $f$ has a real root.


This fact, known to every calculus student, depends on the continuity of $f$ when
regarded as a function $\mathbb{R} \to \mathbb{R}$. Because $f$ has odd degree, there are real numbers $a$
and $b$ such that $f(a) > 0$ and $f(b) < 0$. The graph of $f$ is a continuous curve, so it
must cut the $x$-axis at some value $u$ between $a$ and $b$. Hence $f(u) = 0$, and $u$ is the
desired real root.

75 This section requires results from Section 4.5 on symmetric polynomials.

The algebraic prerequisites for our proof are the existence of splitting fields and
a result about symmetric polynomials. A polynomial $f(x_1, \dots, x_n)$ in $n$ variables
is called symmetric if it is unchanged when the variables are permuted; that is,

$$f(x_{\sigma 1}, x_{\sigma 2}, \dots, x_{\sigma n}) = f(x_1, x_2, \dots, x_n) \quad \text{for all } \sigma \in S_n.$$

Thus, $f(x_1, x_2) = x_1^2 + x_2^2$ is symmetric, as is $f(x_1, x_2, x_3) = x_1 x_2 x_3$. If $1 \le k \le n$,
the elementary symmetric polynomial $s_k = s_k(x_1, \dots, x_n)$ is defined to be the
sum of all possible products of $k$ of the variables $x_1, \dots, x_n$. More formally,

$$s_k = s_k(x_1, \dots, x_n) = \sum_{i_1 < i_2 < \cdots < i_k} x_{i_1} x_{i_2} \cdots x_{i_k}.$$

We define $s_0(x_1, \dots, x_n) = 1$. Hence, for example,

$$s_1(x_1, x_2, \dots, x_n) = x_1 + x_2 + \cdots + x_n,$$
$$s_n(x_1, x_2, \dots, x_n) = x_1 x_2 \cdots x_n,$$
$$s_2(x_1, x_2, x_3, x_4) = x_1 x_2 + x_1 x_3 + x_1 x_4 + x_2 x_3 + x_2 x_4 + x_3 x_4,$$
$$s_3(x_1, x_2, x_3, x_4) = x_1 x_2 x_3 + x_1 x_2 x_4 + x_1 x_3 x_4 + x_2 x_3 x_4.$$

The importance of the polynomials $s_k = s_k(x_1, x_2, \dots, x_n)$ for the splitting of
polynomials lies in the fact that

$$(x - x_1)(x - x_2) \cdots (x - x_n) = x^n - s_1 x^{n-1} + s_2 x^{n-2} - \cdots + (-1)^{n-1} s_{n-1} x + (-1)^n s_n.$$


If $f(x_1, \dots, x_n)$ is any polynomial, it is clear that $f(s_1, s_2, \dots, s_n)$ is symmetric.
The remarkable thing is that the converse holds (Theorem 4 §4.5).


Theorem 1. Fundamental Theorem on Symmetric Polynomials. Every
symmetric polynomial over a ring $R$ is a polynomial $f(s_1, s_2, \dots, s_n)$ over $R$ in the
elementary symmetric polynomials $s_1, \dots, s_n$. In fact, this representation is unique.


For example, if $n = 3$, the symmetric polynomial $x_1^2 + x_2^2 + x_3^2$ has the representation

$$x_1^2 + x_2^2 + x_3^2 = (x_1 + x_2 + x_3)^2 - 2(x_1 x_2 + x_1 x_3 + x_2 x_3) = s_1^2 - 2 s_2.$$

Moreover, this result holds for any number of variables as the reader can verify.


Other examples are given in Section 4.5. 
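
The definition of the elementary symmetric polynomials and the two identities above can be checked numerically. The sketch below evaluates $s_k$ at integer values of the variables; the function names `elementary_symmetric` and `expand_from_roots` are hypothetical, chosen only for this illustration.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(k, xs):
    """Value of s_k at xs: the sum of all products of k of the entries."""
    return sum(prod(c) for c in combinations(xs, k))


def expand_from_roots(xs):
    """Coefficients of (x - x_1)...(x - x_n), highest power of x first.

    The coefficient of x**(n - k) is (-1)**k * s_k(x_1, ..., x_n).
    """
    return [(-1) ** k * elementary_symmetric(k, xs) for k in range(len(xs) + 1)]


xs = [2, 3, 5]
s1 = elementary_symmetric(1, xs)
s2 = elementary_symmetric(2, xs)
print(sum(x * x for x in xs), s1 * s1 - 2 * s2)  # both sides of x1^2+x2^2+x3^2 = s1^2 - 2 s2
print(expand_from_roots(xs))                     # (x - 2)(x - 3)(x - 5)
```

For these values both sides of the first identity equal 38, and the expansion is $x^3 - 10x^2 + 31x - 30$.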


Theorem 2. Fundamental Theorem of Algebra. The field C of complex 
numbers is algebraically closed. 


Proof. We show that a nonconstant polynomial $f$ in $\mathbb{C}[x]$ has a root in $\mathbb{C}$. First, we
show that it suffices to prove this property for real polynomials. If $\bar{f}$ is obtained
from $f$ by conjugating every coefficient, then $g = f\bar{f}$ has real coefficients, as is easily
verified, and if $g(u) = 0$, $u \in \mathbb{C}$, then either $f(u) = 0$ or $0 = \bar{f}(u) = \overline{f(\bar{u})}$. Hence,
$u$ or $\bar{u}$ is a root of $f$.

So let $f$ be a nonconstant polynomial in $\mathbb{R}[x]$ and write $\deg f = d = 2^n m$, where
$m$ is odd. We show that $f$ has a root in $\mathbb{C}$ by induction on $n \ge 0$. If $n = 0$, then $f$
has odd degree and so has a root in $\mathbb{R}$. If $n \ge 1$, regard $f$ as an element in $\mathbb{C}[x]$ and
let $E \supseteq \mathbb{C}$ be a splitting field for $f$. Hence,


$$f = a(x - u_1)(x - u_2) \cdots (x - u_d), \quad \text{where } a \in \mathbb{R} \text{ and each } u_i \in E.$$

It suffices to show that $u_i \in \mathbb{C}$ for some $i$. We have

$$f = a x^d - a\,s_1(u_1, \dots, u_d)\,x^{d-1} + a\,s_2(u_1, \dots, u_d)\,x^{d-2} - \cdots + (-1)^d a\,s_d(u_1, \dots, u_d).$$


Hence, $s_k(u_1, \dots, u_d) \in \mathbb{R}$ for each $k = 1, 2, \dots, d$ because $f \in \mathbb{R}[x]$.




Given $1 \le h \in \mathbb{Z}$, consider the following polynomial in $\mathbb{R}[x][x_1, \dots, x_d]$:

$$f_h(x; x_1, \dots, x_d) = \prod_{i < j} (x - x_i - x_j - h x_i x_j). \qquad (*)$$

For fixed $x$, (*) is a symmetric polynomial in the variables $x_1, x_2, \dots, x_d$ and so, by
Theorem 1, it is a polynomial in $s_1(x_1, \dots, x_d), \dots, s_d(x_1, \dots, x_d)$ with coefficients
in $\mathbb{R}[x]$. Because $s_1(u_1, \dots, u_d), \dots, s_d(u_1, \dots, u_d)$ are in $\mathbb{R}$, this means that the
polynomial $f_h(x) = f_h(x; u_1, \dots, u_d)$ is in $\mathbb{R}[x]$. Moreover,


$$\deg(f_h) = \binom{d}{2} = \tfrac{1}{2} d(d - 1) = 2^{n-1} m (2^n m - 1),$$


where $m(2^n m - 1)$ is odd, so, by induction, $f_h$ has a root in $\mathbb{C}$ for each $h \ge 1$. Hence, (*) implies that, given
$h \ge 1$, $u_i + u_j + h u_i u_j \in \mathbb{C}$ for some $i$ and $j$, with $1 \le i < j \le d$. As the number
of such pairs $(i, j)$ is finite, integers $h \ne h'$ exist such that, for the same pair $(i, j)$, both $u_i + u_j + h u_i u_j$ and
$u_i + u_j + h' u_i u_j$ lie in $\mathbb{C}$. Then both $u_i + u_j$ and $u_i u_j$ are in $\mathbb{C}$, so $(x - u_i)(x - u_j) \in \mathbb{C}[x]$.
But this polynomial splits in $\mathbb{C}$ by the quadratic formula, so $u_i$ and $u_j$ are in $\mathbb{C}$.
Because they are roots of $f$, the proof is complete. ∎
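
The first step of the proof asserts that $f\bar{f}$ has real coefficients, "as is easily verified". Here is a small numerical illustration of that claim, a sketch only: the polynomial `f` is an arbitrary example with small integer real and imaginary parts (so the floating-point arithmetic is exact), and `poly_mul` is a hypothetical helper.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


# f = (1 + 2i) + (3 - i) x + x^2, an arbitrary example in C[x]
f = [1 + 2j, 3 - 1j, 1]
f_bar = [complex(c).conjugate() for c in f]  # conjugate every coefficient
g = poly_mul(f, f_bar)

print(g)
print(all(c.imag == 0 for c in g))
```

The product here is $5 + 2x + 12x^2 + 6x^3 + x^4$. The reason in general is that conjugating every coefficient of $f\bar{f}$ gives $\bar{f}f$, the same polynomial.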


A closer scrutiny of the proof of Theorem 2 reveals that we have proved slightly
more: Let $C \supseteq F$ be fields and assume that $C = F(i)$, where $i^2 + 1 = 0$, and that


(1) $F$ has characteristic 0.
(2) Each element of $C$ has a square root in $C$.
(3) Each polynomial in $F[x]$ of odd degree has a root in $C$.


Then C is algebraically closed. 


6.7 AN APPLICATION TO CYCLIC AND BCH CODES 


There is no branch of mathematics, however abstract, which may not someday be applied 
to phenomena of the real world. 


—Nicolai Ivanovich Lobachevski 


We introduced coding theory in Section 2.11, where we discussed binary linear
codes. Recall that the direct product of $n$ copies of $\mathbb{Z}_2$ is denoted $B^n$ and that
elements of $B^n$ are called words and written as strings of 0's and 1's (called
bits). Thus, $B^2 = \{00, 01, 10, 11\}$. In general, $B^n$ is an additive group of order
$2^n$. A subgroup $C \subseteq B^n$ of order $2^k$, $k \ne 0$, is called a binary linear code or an
$(n, k)$-code for short. In this section, we discuss an important class of linear codes,
called cyclic codes. These codes are useful because they can be implemented by
a simple electronic circuit called a feedback shift register (discussion of which is
beyond the scope of this book). Our interest in these codes is twofold: (1) their
analysis provides an application of the theory of rings, polynomials, and fields and
(2) they include the so-called BCH codes, one of the most widely used classes of
error-correcting codes.

If $C \subseteq B^n$ is a code, a word in $C$ is denoted $a_0 a_1 a_2 \cdots a_{n-1}$, where the bits $a_i$
are in $\mathbb{Z}_2 = \{0, 1\}$. The reason for this choice of subscripts will soon be apparent.




A code is called cyclic if it is closed under cyclic shifts; that is, if $a_0 a_1 a_2 \cdots a_{n-1}$
is in $C$, then $a_{n-1} a_0 a_1 a_2 \cdots a_{n-2}$ is also in $C$.

Example 1. $\{000, 111\}$ and $\{000, 110, 011, 101\}$ are cyclic by inspection.

Example 2. The set of words in $B^n$ of even parity (even number of 1-bits) is a
cyclic code of order $2^{n-1}$. For $n = 4$, it is


$$C = \{0000, 1100, 0110, 0011, 1010, 0101, 1001, 1111\}.$$
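
The "by inspection" checks in Examples 1 and 2 can be automated. The following sketch represents words as Python strings $a_0 a_1 \cdots a_{n-1}$; the helper names are illustrative, and the test only checks closure under the shift, taking for granted that each set is already an additive subgroup.

```python
def cyclic_shift(word):
    """a_0 a_1 ... a_{n-1}  ->  a_{n-1} a_0 a_1 ... a_{n-2}."""
    return word[-1] + word[:-1]


def is_cyclic(code):
    """True if the set of words is closed under the cyclic shift."""
    return all(cyclic_shift(w) in code for w in code)


def even_parity_code(n):
    """All words of length n with an even number of 1-bits."""
    words = (format(i, f"0{n}b") for i in range(2 ** n))
    return {w for w in words if w.count("1") % 2 == 0}


print(is_cyclic({"000", "111"}))
print(is_cyclic({"000", "110", "011", "101"}))
print(sorted(even_parity_code(4)), is_cyclic(even_parity_code(4)))
```

All three checks succeed, and `even_parity_code(4)` has $2^{4-1} = 8$ elements, matching the list in Example 2. A set such as `{"000", "100"}` fails the test because the shift `010` is missing.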


The theory of rings enters the picture as an elegant means of describing these
cyclic codes. Let $x$ be an indeterminate over $\mathbb{Z}_2$ and consider the principal ideal
$(1 - x^n)$ of all multiples of $1 - x^n$ in the polynomial ring $\mathbb{Z}_2[x]$. The factor ring is
denoted

$$B_n = \frac{\mathbb{Z}_2[x]}{(1 - x^n)}.$$

Recall (Theorem 2 §4.3) that the ring $B_n$ can be described as

$$B_n = \{a_0 + a_1 t + \cdots + a_{n-1} t^{n-1} \mid a_i \in \mathbb{Z}_2,\ t^n = 1\}.$$

The operations in $B_n$ are the same as for polynomials, except that $t^n = 1$. The map

$$\theta : \mathbb{Z}_2[x] \to B_n, \quad \text{given by } \theta(f) = f(t),$$

is an onto ring homomorphism with $\theta(x) = t$ and $\ker\theta = (1 - x^n)$. Hence,


$$B_n = \{f(t) \mid f \in \mathbb{Z}_2[x]\}.$$


Moreover, $\{1, t, \dots, t^{n-1}\}$ is a basis of $B_n$ as a vector space over the field $\mathbb{Z}_2$, so the
additive groups $B_n$ and $B^n$ are isomorphic via the correspondence


$$a_0 + a_1 t + \cdots + a_{n-1} t^{n-1} \leftrightarrow a_0 a_1 \cdots a_{n-1}.$$

For example, if $n = 5$, some typical correspondences are


$1 + t^2 + t^3$ corresponds to 10110,
$1$ corresponds to 10000,
$1 + t + t^2 + t^3 + t^4$ corresponds to 11111.


Because of this isomorphism, we think of codes as additive subgroups of $B^n$ or $B_n$.
We call these the word form and the polynomial form of the code, respectively,
and use both points of view in this section. The word form of a code is useful
when matrix multiplication is used for encoding (see Section 2.11). However, the
polynomial form of a code has the advantage that the extra ring structure of $B_n$ is
useful for describing cyclic codes.

Indeed, if $C \subseteq B_n$ is a code and $f(t) = a_0 + a_1 t + \cdots + a_{n-1} t^{n-1}$ is an element
in $C$, the cyclic shift of $f(t)$ is


$$a_{n-1} + a_0 t + a_1 t^2 + \cdots + a_{n-2} t^{n-1} = t f(t)$$

using the multiplication in $B_n$ and the fact that $t^n = 1$. Hence,

$$C \text{ is cyclic if and only if } tC \subseteq C,$$


where, of course, $tC = \{t f(t) \mid f(t) \in C\}$. But because $C$ is an additive subgroup
of $B_n$, the condition $tC \subseteq C$ means $C$ is an ideal of the (commutative) ring $B_n$.
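
The identification of the cyclic shift with multiplication by $t$ is easy to see in code. In this sketch an element $a_0 + a_1 t + \cdots + a_{n-1} t^{n-1}$ of $B_n$ is stored as a Python integer whose bit $i$ is $a_i$; that encoding and the names `to_poly`, `to_word`, and `mul_mod` are assumptions made for the illustration, not notation from the text.

```python
def to_poly(word):
    """Word a_0 a_1 ... a_{n-1} -> integer with bit i equal to a_i."""
    return sum(int(bit) << i for i, bit in enumerate(word))


def to_word(poly, n):
    """Inverse of to_poly for words of length n."""
    return "".join(str((poly >> i) & 1) for i in range(n))


def mul_mod(a, b, n):
    """Product of a and b in B_n = Z_2[x]/(1 - x^n)."""
    mask = (1 << n) - 1
    result = 0
    for i in range(n):
        if (b >> i) & 1:
            # Multiplying by t^i rotates the n coefficients by i places (t^n = 1).
            result ^= ((a << i) | (a >> (n - i))) & mask
    return result


t = 0b10  # the element t of B_n
for word in ["1100", "1001", "0011"]:
    shifted = to_word(mul_mod(to_poly(word), t, 4), 4)
    print(word, "->", shifted)
```

For instance `1001` (the polynomial $1 + t^3$) is sent to `1100` (the polynomial $t + t^4 = 1 + t$), which is exactly its cyclic shift.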




This is wonderful news because the ideals of $B_n$ are easy to describe. Recall the
onto ring homomorphism $\theta : \mathbb{Z}_2[x] \to B_n$ given by $\theta(f) = f(t)$ for $f \in \mathbb{Z}_2[x]$, with
$\ker\theta = (1 - x^n)$. If $C$ is an ideal of $B_n$ define


$$A = \{f \in \mathbb{Z}_2[x] \mid f(t) \in C\}.$$


It is routine to verify that $A$ is an ideal of $\mathbb{Z}_2[x]$ such that $\ker\theta \subseteq A$ and that
$C = \theta(A)$. But $A$ is a principal ideal because $\mathbb{Z}_2$ is a field. More precisely, Theorem
1 §4.3 shows that if $g$ is a nonzero polynomial in $A$ of minimal degree, then $g$ is
uniquely determined by $A$ (it is automatically monic because the field is $\mathbb{Z}_2$), and


$$A = (g) = \{qg \mid q \in \mathbb{Z}_2[x]\} = \mathbb{Z}_2[x]\,g.$$


Moreover, $\ker\theta \subseteq A$ means that $(1 - x^n) \subseteq (g)$, so $g$ is a divisor of $1 - x^n$ in $\mathbb{Z}_2[x]$.
Hence, every ideal $C$ of $B_n$ has the form


$$C = \theta(A) = (g(t)) = \{q(t)g(t) \mid q(t) \in B_n\} = B_n\,g(t),$$


where $g$ divides $1 - x^n$. Theorem 1 summarizes this discussion.


Theorem 1. The following conditions are equivalent for a code $C \subseteq B_n$:
(1) $C$ is cyclic.
(2) $tC \subseteq C$.
(3) $C$ is an ideal of the ring $B_n$.

In this case, a divisor $g$ of $1 - x^n$ exists in $\mathbb{Z}_2[x]$ such that


$$C = (g(t)) = \{q(t)g(t) \mid q(t) \text{ in } B_n\} = \{f(t) \mid g \text{ divides } f \text{ in } \mathbb{Z}_2[x]\}.$$


Moreover, $g$ is the unique polynomial in $\mathbb{Z}_2[x]$ of lowest degree such that $g(t) \in C$.
Finally, if $(f(t))$ is another such code, where $f$ divides $1 - x^n$, then $(g(t)) \subseteq (f(t))$
if and only if $f$ divides $g$ in $\mathbb{Z}_2[x]$.


Proof. Only the last statement remains to be proved. Suppose that $(g(t)) \subseteq (f(t))$
so that $g(t) = q(t)f(t)$ in $B_n$. Then $g - qf$ lies in $\ker\theta = (1 - x^n)$. Since $f$ divides
$1 - x^n$, this implies that $f$ divides $g$. The converse is clear. ∎


To illustrate Theorem 1, consider again the code

$$C = \{0,\ 1 + t,\ t + t^2,\ t^2 + t^3,\ 1 + t^2,\ t + t^3,\ 1 + t^3,\ 1 + t + t^2 + t^3\}$$

in Example 2. It is generated by $1 + t$ because $g = 1 + x$ is the nonzero polynomial
of least degree such that $g(t)$ is in $C$. In general, a cyclic code can have more than
one generator (both $t + t^2$ and $1 + t^3$ generate $C$), but there is only one of least
degree. This unique polynomial is called the minimal generator of $C$.

Hence, determining the cyclic codes in $B_n$ comes down to identifying all the
divisors of $1 - x^n = 1 + x^n$ in $\mathbb{Z}_2[x]$. These divisors, in turn, are determined by the
factorization of $1 + x^n$ into irreducible factors in $\mathbb{Z}_2[x]$. The factorizations for the
first few values of $n$ are


$$1 + x^2 = (1 + x)^2,$$
$$1 + x^3 = (1 + x)(1 + x + x^2),$$
$$1 + x^4 = (1 + x)^4,$$
$$1 + x^5 = (1 + x)(1 + x + x^2 + x^3 + x^4),$$
$$1 + x^6 = (1 + x)^2 (1 + x + x^2)^2,$$
$$1 + x^7 = (1 + x)(1 + x + x^3)(1 + x^2 + x^3).$$


Recall that quadratic and cubic polynomials are irreducible if they have no root in
$\mathbb{Z}_2$, but no such simple test exists if the degree is greater than 3.
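
The factorizations of $1 + x^n$ listed above can be confirmed by multiplying the factors out over $\mathbb{Z}_2$. The sketch below stores a polynomial as an integer bitmask (bit $i$ is the coefficient of $x^i$); this encoding and the helper names `poly` and `gf2_mul` are assumptions for the illustration.

```python
def poly(*exponents):
    """Polynomial over Z_2 with the given (distinct) exponents, as a bitmask."""
    return sum(1 << e for e in exponents)


def gf2_mul(a, b):
    """Product in Z_2[x]: shift-and-add, where addition of coefficients is XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def product(*factors):
    result = 1
    for f in factors:
        result = gf2_mul(result, f)
    return result


one_plus_x = poly(0, 1)
print(product(one_plus_x, one_plus_x) == poly(0, 2))
print(product(one_plus_x, poly(0, 1, 2)) == poly(0, 3))
print(product(one_plus_x, poly(0, 1, 3), poly(0, 2, 3)) == poly(0, 7))
```

Each comparison prints `True`. The first line is the familiar identity $(1 + x)^2 = 1 + x^2$, which holds because $2 = 0$ in $\mathbb{Z}_2$.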

We note in passing that if $F$ is a field, finding the irreducible factors of $1 - x^n$
in $F[x]$ begins by factoring it as a product of cyclotomic polynomials. We do
so for $F = \mathbb{Q}$ in Section 10.4, where we show that the cyclotomic polynomials
themselves are irreducible. However, if $F = \mathbb{Z}_2$, each cyclotomic polynomial factors
into irreducible polynomials of the same degree. Discussion of this topic is beyond
the scope of this book.⁷⁶




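Example 3. Recalling that $2 = 0$ in $\mathbb{Z}_2$, we get $1 + x^4 = (1 + x)^4$ in $\mathbb{Z}_2[x]$. Hence,
the divisors of $1 + x^4$ (apart from $1 + x^4$ itself, which gives the zero code) are
$1$, $1 + x$, $(1 + x)^2$, and $(1 + x)^3$. Because $(1 + x)^2 = 1 + x^2$
and $(1 + x)^3 = 1 + x + x^2 + x^3$, the cyclic codes in $B_4$ are

$$(1) = B_4,$$
$$(1 + t) = \{0,\ 1 + t,\ t + t^2,\ t^2 + t^3,\ 1 + t^2,\ t + t^3,\ 1 + t^3,\ 1 + t + t^2 + t^3\},$$
$$(1 + t^2) = \{0,\ 1 + t^2,\ t + t^3,\ 1 + t + t^2 + t^3\},$$
$$(1 + t + t^2 + t^3) = \{0,\ 1 + t + t^2 + t^3\}.$$

Note that $(1 + t)$ corresponds to the code in Example 2.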


Example 4. For any $n$, $1 - x^n = 1 + x^n = (1 + x)(1 + x + \cdots + x^{n-1})$ in $\mathbb{Z}_2[x]$.
Hence, there are always two cyclic codes:


• $(1 + t)$ is the ideal of polynomials of even parity (coefficients sum to 0).

• $(1 + t + \cdots + t^{n-1}) = \{0,\ 1 + t + \cdots + t^{n-1}\}$.

Example 5. We have $1 - x^5 = 1 + x^5 = (1 + x)(1 + x + x^2 + x^3 + x^4)$, and
$1 + x + x^2 + x^3 + x^4$ is irreducible (Exercise 10). Hence, $B_5$ has three cyclic codes:

$$(1) = B_5,$$
$$(1 + t) = \text{the polynomials of even parity},$$
$$(1 + t + t^2 + t^3 + t^4) = \{0,\ 1 + t + t^2 + t^3 + t^4\}.$$

A code $C \subseteq B_n$ is a $\mathbb{Z}_2$-subspace of $B_n$, so we may speak of the dimension of $C$ over
$\mathbb{Z}_2$. We write it as $\dim_{\mathbb{Z}_2} C$ or simply $\dim C$ when no confusion can result. Then


$$|C| = 2^k, \quad \text{where } \dim C = k.$$

We now let $C = (g(t))$ be a cyclic code, where $g$ is a divisor of $1 - x^n = 1 + x^n$. If
$m = \deg g$, we are going to show that $\dim C = n - m$ and hence that $|C| = 2^{n-m}$.


76 See, for example, Lidl, R. and Niederreiter, H., Introduction to Finite Fields and Their
Applications, Cambridge, England: Cambridge University Press, 1986, Section 2.4.


Recall some ring theoretic notation: If $R$ is a commutative ring and $a \in R$, the set

$$\operatorname{ann} a = \{r \in R \mid ra = 0\} \text{ is an ideal of } R$$


(called the annihilator of $a$ in $R$). These ideals play a basic role in $B_n$.


Lemma 1. Let $g$ be a divisor of $1 - x^n$ in $\mathbb{Z}_2[x]$, say $1 - x^n = gh$. Then, in the
ring $B_n$, $\operatorname{ann} g(t) = (h(t))$.


Proof. We have $h(t)g(t) = 1 - t^n = 0$, so $h(t) \in \operatorname{ann} g(t)$. Hence, $(h(t)) \subseteq \operatorname{ann} g(t)$.
Conversely, if $f(t)g(t) = 0$, then $fg \in \ker\theta = (1 - x^n)$, say $fg = q(1 - x^n)$. As
$1 - x^n = hg$, it follows that $f = qh$. Hence, $f(t) \in (h(t))$, as required. ∎


Theorem 2. Let $C = (g(t))$ be a cyclic code in $B_n$, where $g$ divides $1 - x^n$ in
$\mathbb{Z}_2[x]$, and write $m = \deg g$ and $k = n - m$. Then


$$X = \{g(t),\ t g(t),\ t^2 g(t),\ \dots,\ t^{k-1} g(t)\}$$

is a $\mathbb{Z}_2$-basis of $C$. In particular, $|C| = 2^k = 2^{n-m}$, so $C$ is an $(n, k)$-code.


Proof. It suffices to prove that $X$ is a $\mathbb{Z}_2$-basis of $C$. Write $1 - x^n = hg$ so that
$\deg h = n - m = k$. Given an element $f(t)$ of $C$, say $f(t) = q(t)g(t)$, write $q = ph + r$
in $\mathbb{Z}_2[x]$, where $r = a_0 + a_1 x + \cdots + a_{k-1} x^{k-1}$, $a_i \in \mathbb{Z}_2$. Then $h(t)g(t) = 0$ by Lemma
1, so $f(t) = r(t)g(t)$. Hence, $X$ is a spanning set for $C$. To see that $X$ is linearly
independent, suppose that

$$a_0 g(t) + a_1 t g(t) + \cdots + a_{k-1} t^{k-1} g(t) = 0, \quad a_i \in \mathbb{Z}_2.$$


Write $f = a_0 + a_1 x + \cdots + a_{k-1} x^{k-1}$; it suffices to show that $f = 0$ in $\mathbb{Z}_2[x]$. But
$f(t)g(t) = 0$, so $f(t) \in (h(t))$ by Lemma 1. Hence, $h$ divides $f$ in $\mathbb{Z}_2[x]$ by Theorem
1, which means that $f = 0$ (otherwise, $k = \deg h \le \deg f \le k - 1$). ∎
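
Theorem 2 can be checked by brute force for small $n$. The sketch below lists the ideal $(g(t)) = B_n g(t)$ directly and compares it with the span of $g(t), t g(t), \dots, t^{k-1} g(t)$. Polynomials are stored as integer bitmasks (bit $i$ is the coefficient of $t^i$), an encoding assumed for this illustration, and the function names are hypothetical.

```python
def mul_mod(a, b, n):
    """Product of a and b in B_n = Z_2[x]/(1 - x^n)."""
    mask = (1 << n) - 1
    result = 0
    for i in range(n):
        if (b >> i) & 1:
            result ^= ((a << i) | (a >> (n - i))) & mask  # a * t^i, using t^n = 1
    return result


def ideal(g, n):
    """All multiples q(t) g(t) with q(t) in B_n."""
    return {mul_mod(q, g, n) for q in range(1 << n)}


def span(vectors):
    """All Z_2-linear combinations (XOR sums) of the given elements."""
    words = {0}
    for v in vectors:
        words |= {w ^ v for w in words}
    return words


def basis_from_theorem_2(g, n):
    k = n - (g.bit_length() - 1)  # k = n - deg g
    return [mul_mod(1 << i, g, n) for i in range(k)]


# n = 4: generators 1 + t, 1 + t^2, 1 + t + t^2 + t^3 (Example 3)
for g in (0b0011, 0b0101, 0b1111):
    code = ideal(g, 4)
    print(bin(g), len(code), span(basis_from_theorem_2(g, 4)) == code)
```

The sizes printed are 8, 4, and 2, that is $2^{4 - \deg g}$ in each case, and the span of the proposed basis always equals the ideal.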


Matrix Description 


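The linear codes in Section 2.11 were described using binary matrices. As already
noted, $B^n$ is an $n$-dimensional vector space over $\mathbb{Z}_2$, and an $(n, k)$-code $C \subseteq B^n$ is
nothing but a $k$-dimensional subspace. If $\{w_0, w_1, \dots, w_{k-1}\}$ is a basis of $C$, let

$$G = \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_{k-1} \end{bmatrix}$$

be the $k \times n$ matrix with the words $w_i$ as its rows. If $u = a_0 a_1 \cdots a_{k-1}$ is a word in $B^k$, then

$$uG = [a_0\ a_1\ \cdots\ a_{k-1}] \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_{k-1} \end{bmatrix} = a_0 w_0 + a_1 w_1 + \cdots + a_{k-1} w_{k-1},$$

so, as $C$ is spanned by the words $w_0, w_1, \dots, w_{k-1}$, we have


$$C = \{uG \mid u \in B^k\}.$$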


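Hence, as in Section 2.11, $G$ is called a generator matrix⁷⁷ for the code $C$.
Similarly, if $H$ is an $n \times (n - k)$ matrix such that


$$C = \{w \in B^n \mid wH = 0\},$$


then $H$ is called a parity check matrix for the code $C$. Both methods of describing
$C$ are useful, and we can easily find such matrices if $C$ is a cyclic code.

In fact, let $C = (g(t))$ be a cyclic code in $B_n$, where $1 - x^n = gh$ in $\mathbb{Z}_2[x]$.
Write $\deg g = m$, $\deg h = k$, so that $n = m + k$. Then $g(t)$ and $h(t)$ give rise to
a generator matrix $G$ and a parity check matrix $H$ for (the word form of) $C$. If
$g(t) = g_0 + g_1 t + \cdots + g_m t^m$, we define a $k \times n$ binary matrix


$$G = \begin{bmatrix} g(t) \\ t g(t) \\ \vdots \\ t^{k-1} g(t) \end{bmatrix}
= \begin{bmatrix}
g_0 & g_1 & g_2 & \cdots & g_m & 0 & 0 & \cdots & 0 \\
0 & g_0 & g_1 & \cdots & g_{m-1} & g_m & 0 & \cdots & 0 \\
\vdots & & \ddots & & & & \ddots & & \vdots \\
0 & \cdots & 0 & g_0 & g_1 & \cdots & & \cdots & g_m
\end{bmatrix} \qquad (*)$$


where the rows of $G$ are the first $k$ cyclic shifts of the coefficients of $g(t)$. Given a
word $u = a_0 a_1 \cdots a_{k-1}$ in $B^k$, and being somewhat facile with the notation, we get


$$uG = [a_0\ a_1\ \cdots\ a_{k-1}] \begin{bmatrix} g(t) \\ t g(t) \\ \vdots \\ t^{k-1} g(t) \end{bmatrix}
= (a_0 + a_1 t + \cdots + a_{k-1} t^{k-1})\, g(t).$$


Hence, $G$ is a generator matrix for (the word form of) $C = (g(t))$.

Lemma 1 (with $g$ and $h$ interchanged) shows that $C = \{f(t) \mid f(t)h(t) = 0\}$.
Hence, not surprisingly, a parity check matrix for $C$ comes from $h(t)$ in a similar
way. If $h(t) = h_0 + h_1 t + \cdots + h_k t^k$, define the $n \times m$ matrix


$$H = \begin{bmatrix}
0 & 0 & \cdots & h_k \\
0 & 0 & \cdots & h_{k-1} \\
\vdots & \vdots & & \vdots \\
0 & h_k & \cdots & \\
h_k & h_{k-1} & \cdots & \\
\vdots & \vdots & & \vdots \\
h_1 & h_0 & \cdots & 0 \\
h_0 & 0 & \cdots & 0
\end{bmatrix} \qquad (**)$$


where the columns are the first $m$ cyclic shifts (bottom up) of the coefficients of
$h(t)$; that is, column $j$ (for $1 \le j \le m$) reads $h_k, h_{k-1}, \dots, h_0$ from top to bottom,
preceded by $m - j$ zeros and followed by $j - 1$ zeros.
The proof that $H$ is a parity check matrix depends on Lemma 2.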


77 We do not insist that the first k columns of G form the k x k identity matrix; that is, we do not
insist that G is a standard generator matrix for C (see the definition preceding Theorem 6 §2.11). 
This restriction is not severe (Exercise 19). 




Lemma 2. If $G$ and $H$ are as in (*) and (**), then $GH = 0$.

Proof. As might be expected, the reason is that $g(t)h(t) = 0$ in $B_n$. Write


$$g(t) = \sum_{i=0}^{n-1} g_i t^i \quad \text{and} \quad h(t) = \sum_{i=0}^{n-1} h_i t^i,$$

where $g_{m+1} = \cdots = g_{n-1} = 0$ and $h_{k+1} = \cdots = h_{n-1} = 0$. As $t^n = 1$, the coefficient
of $t^p$ in the product $g(t)h(t) = 0$ is

$$g_0 h_p + g_1 h_{p-1} + \cdots + g_p h_0 + g_{p+1} h_{n-1} + \cdots + g_{n-1} h_{p+1} = 0.$$

Taking subscripts modulo $n$, this expression can be compactly written as


$$\sum_{i+j=p} g_i h_j = 0 \quad \text{for all } p = 0, 1, \dots, n - 1.$$


Now the matrix product of a typical row of $G$ and a typical column of $H$ is


$$[g_p\ g_{p+1}\ \cdots\ g_{p+n-1}] \begin{bmatrix} h_{q+n-1} \\ \vdots \\ h_{q+1} \\ h_q \end{bmatrix}
= \sum_{m=0}^{n-1} g_{p+m} h_{q+n-m-1} = \sum_{i+j=p+q+n-1} g_i h_j = 0$$

by the preceding equation. Hence $GH = 0$. ∎


Theorem 3. Let $C = (g(t))$ be a cyclic code (that is, a nonzero ideal) in the ring
$B_n$, and let $1 - x^n = gh$ in $\mathbb{Z}_2[x]$, where $\deg g = m$ and $\deg h = k = n - m$. If $G$
and $H$ are as in (*) and (**), then the word form of the code $C$ is given by


$$C = \{uG \mid u \in B^k\} = \{w \in B^n \mid wH = 0\}.$$

In other words, $G$ is a generator matrix for the word form of the code $C$, and $H$ is
a parity check matrix for the word form of $C$.


Proof. We already know $C = \{uG \mid u \in B^k\}$. Write $A = \{w \in B^n \mid wH = 0\}$. Then
Lemma 2 shows that $C \subseteq A$, so as $|C| = 2^k$ by Theorem 2, it remains to show that
$|A| = 2^k$. To this end, let $C_0$ denote the subspace of $B^n$ spanned by the $m$ columns
of $H$. These columns are independent (in equation (**), $h_k = 1$ because $\deg h = k$),
so $\dim C_0 = m$, whence $|C_0| = 2^m$. On the other hand, consider the orthogonal
complement $C_0^\perp$ of $C_0$, defined by


$$C_0^\perp = \{w \in B^n \mid w \cdot z = 0 \text{ for all } z \in C_0\},$$


where $w \cdot z$ denotes the dot product. If $\{w_1, w_2, \dots, w_m\}$ is a basis of $C_0$ and
$B = [w_1\ w_2\ \cdots\ w_m]$, then $C_0^\perp = \{w \in B^n \mid wB = 0\} = \{w \in B^n \mid B^T w^T = 0\}$,
so a basic theorem of linear algebra⁷⁸ shows that


$$\dim C_0^\perp = \dim B^n - \dim C_0 = n - \operatorname{rank} B^T = n - m = k.$$

The proof is completed by the observation that

$$A = \{w \in B^n \mid w \cdot z = 0 \text{ for all columns } z \text{ of } H\} = C_0^\perp.$$ ∎


78 If $\operatorname{null} A = \{X \mid AX = 0\}$, then $\dim(\operatorname{null} A) = n - \operatorname{rank} A$. Note that the usual proof that
$\dim C_0^\perp = n - \dim C_0$ breaks down because $w \cdot w = 0$ can happen with $w \ne 0$ (char $B_n = 2$).


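Example 6. We have $1 - x^7 = (1 + x + x^3)(1 + x + x^2 + x^4)$. Hence, take $n = 7$,
$m = 3$, and $k = 4$, using $g(t) = 1 + t + t^3 = 1101000$ and $h(t) = 1 + t + t^2 + t^4 =
1110100$. Then


$$G = \begin{bmatrix}
1 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1
\end{bmatrix}
\quad \text{and} \quad
H = \begin{bmatrix}
0 & 0 & 1 \\
0 & 1 & 0 \\
1 & 0 & 1 \\
0 & 1 & 1 \\
1 & 1 & 1 \\
1 & 1 & 0 \\
1 & 0 & 0
\end{bmatrix}.$$


This is the Hamming (7, 4)-code discussed (with a slightly different notation) in
Examples 9 and 12, §2.11. □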
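
The matrices in Example 6 follow mechanically from the coefficient lists of $g$ and $h$, as described in (*) and (**). The following sketch builds both and confirms that $GH = 0$ over $\mathbb{Z}_2$; the function names are hypothetical, and matrices are plain lists of rows.

```python
def generator_matrix(g, n):
    """k x n matrix whose rows are the first k cyclic shifts of g = [g_0, ..., g_m]."""
    k = n - (len(g) - 1)
    return [[0] * i + g + [0] * (k - 1 - i) for i in range(k)]


def parity_check_matrix(h, n):
    """n x m matrix built from h = [h_0, ..., h_k].

    Column j (0-based) holds h_k, ..., h_0 from top to bottom, preceded by
    m - 1 - j zeros and followed by j zeros.
    """
    m = n - (len(h) - 1)
    columns = [[0] * (m - 1 - j) + h[::-1] + [0] * j for j in range(m)]
    return [[columns[j][i] for j in range(m)] for i in range(n)]


def matmul_mod2(a, b):
    """Matrix product with entries reduced modulo 2."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*b)] for row in a]


G = generator_matrix([1, 1, 0, 1], 7)        # g(t) = 1 + t + t^3
H = parity_check_matrix([1, 1, 1, 0, 1], 7)  # h(t) = 1 + t + t^2 + t^4
for row in G:
    print(row)
for row in H:
    print(row)
print(matmul_mod2(G, H))
```

The final product is the $4 \times 3$ zero matrix, in agreement with Lemma 2.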


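Example 7. For any $n \ge 2$, $1 - x^n = (1 + x)(1 + x + x^2 + \cdots + x^{n-1})$, and the code
$(1 + t)$ consists of the polynomials of even parity. Here, $g(t) = 1 + t = 1100\cdots0$ and
$h(t) = 1 + t + \cdots + t^{n-1} = 111\cdots1$. Hence,


$$G = \begin{bmatrix}
1 & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & 1 & \cdots & 0 & 0 \\
\vdots & & & & \ddots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 & 1
\end{bmatrix}
\quad \text{and} \quad
H = \begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.$$

In this case, $H$ is obviously a parity check matrix for the code. □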


Error Detection 


So far we have paid no attention to the error detecting and correcting capabilities of
a cyclic code $C$. They depend on the minimum distance $d$ of $C$, that is, by Theorem 4
§2.11, on the minimum weight of a nonzero code word in $C$. Here, the weight, $\operatorname{wt} c$,
of a word $c$ in $C$ is the number of 1's occurring as bits in $c$. Theorem 4 gives
a lower bound on $d$, which is useful in constructing some important cyclic codes.

Theorem 4 involves the following notion. Let $F$ be any field that contains $\mathbb{Z}_2$ (for
example, any Galois field $GF(2^k)$). Given $n \ge 1$, an element $\zeta$ of $F$ is a primitive
nth root of unity over $\mathbb{Z}_2$ if it has order $n$ in the group $F^*$ of nonzero elements of
$F$. Hence, $\zeta^n = 1$ but $1, \zeta, \zeta^2, \dots, \zeta^{n-1}$ are all distinct. Note that $n$ must be odd
because $n = 2m$ gives $0 = \zeta^n - 1 = (\zeta^m - 1)^2$, so $\zeta^m = 1$. Observe that


$$x^n - 1 = (x - 1)(x - \zeta)(x - \zeta^2) \cdots (x - \zeta^{n-1})$$


because $1, \zeta, \zeta^2, \dots, \zeta^{n-1}$ are all roots of $x^n - 1$ in $F$ and they are distinct ($\zeta$ is
primitive). Hence, every divisor $g$ of $x^n - 1$ (and so the generator of each cyclic
code) is a product of terms $(x - \zeta^k)$. In particular, the roots of $g$ in $F$ are all powers
of $\zeta$.




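Theorem 4. Let $C = (g(t))$ be a cyclic code in $B_n$. If $\zeta$ is a primitive nth root of
unity over $\mathbb{Z}_2$, assume that $t$ consecutive powers of $\zeta$ are roots of $g$, say


$$g(\zeta^b) = g(\zeta^{b+1}) = \cdots = g(\zeta^{b+t-1}) = 0.$$

Then $d \ge t + 1$, where $d$ is the minimum distance of the code $C$.


Proof. Let $f(t) = f_0 + f_1 t + \cdots + f_{n-1} t^{n-1}$ be a nonzero element of $C$ and write the
corresponding word as $f = f_0 f_1 \cdots f_{n-1}$. It suffices to show that $\operatorname{wt} f \ge t + 1$. Now
$f = qg$ for some $q$ in $\mathbb{Z}_2[x]$ by Theorem 1, so $f(\zeta^{b+i}) = 0$ for $0 \le i \le t - 1$. Matrix
multiplication gives


$$[f_0\ f_1\ f_2\ \cdots\ f_{n-1}]
\begin{bmatrix}
1 & 1 & \cdots & 1 \\
\zeta^b & \zeta^{b+1} & \cdots & \zeta^{b+t-1} \\
\zeta^{2b} & \zeta^{2(b+1)} & \cdots & \zeta^{2(b+t-1)} \\
\vdots & \vdots & & \vdots \\
\zeta^{(n-1)b} & \zeta^{(n-1)(b+1)} & \cdots & \zeta^{(n-1)(b+t-1)}
\end{bmatrix}
= [0\ 0\ \cdots\ 0].$$


Now suppose that $\operatorname{wt} f = s \le t$ so that $f$ has exactly $s$ nonzero bits, for instance,
$f_{i_1} = f_{i_2} = \cdots = f_{i_s} = 1$ and $f_i = 0$ otherwise. Hence, only rows $i_1, i_2, \dots, i_s$ in the
matrix contribute to the product. Consider these rows and the first $s$ columns of
the matrix product. The result is


$$[1\ 1\ \cdots\ 1]
\begin{bmatrix}
\zeta^{i_1 b} & \zeta^{i_1(b+1)} & \cdots & \zeta^{i_1(b+s-1)} \\
\zeta^{i_2 b} & \zeta^{i_2(b+1)} & \cdots & \zeta^{i_2(b+s-1)} \\
\vdots & \vdots & & \vdots \\
\zeta^{i_s b} & \zeta^{i_s(b+1)} & \cdots & \zeta^{i_s(b+s-1)}
\end{bmatrix}
= [0\ 0\ \cdots\ 0].$$


Hence, this $s \times s$ matrix has zero determinant; that is,


$$0 = \zeta^{i_1 b + i_2 b + \cdots + i_s b} \det
\begin{bmatrix}
1 & \zeta^{i_1} & \cdots & (\zeta^{i_1})^{s-1} \\
1 & \zeta^{i_2} & \cdots & (\zeta^{i_2})^{s-1} \\
\vdots & \vdots & & \vdots \\
1 & \zeta^{i_s} & \cdots & (\zeta^{i_s})^{s-1}
\end{bmatrix}.$$


But this is a contradiction because this last determinant is nonzero. (It is a
Vandermonde determinant⁷⁹ and $\zeta^{i_1}, \zeta^{i_2}, \dots, \zeta^{i_s}$ are distinct because $\zeta$ is primitive.)
Hence, $\operatorname{wt} f \ge t + 1$, as required. ∎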


Theorem 4 suggests a way to construct a cyclic code with any predetermined
minimum distance $d$: Just choose a generator polynomial having as roots $d - 1$
consecutive powers of a primitive root of unity. To do so, we recall a notion
introduced in Section 6.2. If $F$ is a finite field containing $\mathbb{Z}_2$ and $v \in F$, the minimal
polynomial of $v$ over $\mathbb{Z}_2$ is the nonzero polynomial $m$ in $\mathbb{Z}_2[x]$ of least degree such
that $m(v) = 0$. Then, if $f \in \mathbb{Z}_2[x]$, we have (Theorem 3 §6.2)


$$f(v) = 0 \text{ if and only if } m \text{ divides } f \text{ in } \mathbb{Z}_2[x].$$


79 See, for example, Nicholson, W.K., Linear Algebra with Applications, 7th ed., Whitby:
PWS-Kent, 2012, Section 3.2.




In particular, minimal polynomials are irreducible, and any irreducible polynomial
is the minimal polynomial over $\mathbb{Z}_2$ of each of its roots.

The code is constructed as follows. Let $2 \le d \le n$ and $0 \le b$ be integers, $\zeta$
be a primitive nth root of unity over $\mathbb{Z}_2$, and $m_i$ be the minimal polynomial over
$\mathbb{Z}_2$ of $\zeta^i$. If


$$g = \operatorname{lcm}(m_b, m_{b+1}, \dots, m_{b+d-2}),$$


then the cyclic code $(g(t))$ in $B_n$ is called the binary BCH code⁸⁰ of length $n$
and designated distance $d$. Note that because the minimal polynomials $m_i$ are
all irreducible, $g$ is the product of the distinct polynomials in the list $m_b, m_{b+1}, \dots,
m_{b+d-2}$. Of course, two of these polynomials may be equal.

Theorem 5 collects some basic properties of these BCH codes.


Theorem 5. Let $C = (g(t))$ be a BCH code as defined above.
(1) The minimum distance of $C$ is at least $d$.
(2) $f(t) \in C$ if and only if $f(\zeta^i) = 0$ for $i = b, b + 1, \dots, b + d - 2$.
(3) The following matrix $H$ is a parity check matrix for $C$:


$$H = \begin{bmatrix}
1 & 1 & \cdots & 1 \\
\zeta^b & \zeta^{b+1} & \cdots & \zeta^{b+d-2} \\
\zeta^{2b} & \zeta^{2(b+1)} & \cdots & \zeta^{2(b+d-2)} \\
\vdots & \vdots & & \vdots \\
\zeta^{(n-1)b} & \zeta^{(n-1)(b+1)} & \cdots & \zeta^{(n-1)(b+d-2)}
\end{bmatrix}$$

Proof. (1) Because $m_i$ divides $g$ for each $i$, each of $\zeta^b, \zeta^{b+1}, \dots, \zeta^{b+d-2}$ is a root of
$g$. Hence, (1) follows from Theorem 4.

(2) If $f(t)$ is in $C$, then $g$ divides $f$, so $g(\zeta^i) = 0$ implies that $f(\zeta^i) = 0$.
Conversely, if $f(\zeta^i) = 0$ for each $i = b, b + 1, \dots, b + d - 2$, then $m_i$ divides $f$ by the
definition of $m_i$, and so $g$ divides $f$ by the definition of $g$. This shows that $f(t)$ is
in $C$ and so proves (2).

(3) Given $f(t) = f_0 + f_1 t + \cdots + f_{n-1} t^{n-1}$ in $B_n$, write the corresponding word
as $f = f_0 f_1 \cdots f_{n-1}$. Then

$$fH = [f(\zeta^b)\ f(\zeta^{b+1})\ \cdots\ f(\zeta^{b+d-2})],$$


so (2) shows that $fH = 0$ if and only if $f(t) \in C$, which proves (3). ∎


Example 8. The polynomial $1 + x + x^3$ is irreducible over $\mathbb{Z}_2$, so if $\zeta$ is one of its
roots, we can construct the Galois field $F = GF(8)$ as follows (Theorem 2 §4.3):


$$F = \{a_0 + a_1\zeta + a_2\zeta^2 \mid a_i \in \mathbb{Z}_2,\ \zeta^3 = 1 + \zeta\}.$$


The powers of $\zeta$ in $F$ are $1$, $\zeta$, $\zeta^2$, $\zeta^3 = 1 + \zeta$, $\zeta^4 = \zeta + \zeta^2$, $\zeta^5 = 1 + \zeta + \zeta^2$, $\zeta^6 = 1 + \zeta^2$,
and $\zeta^7 = 1$. Thus, $\zeta$ is a primitive seventh root of unity over $\mathbb{Z}_2$ with minimal
polynomial $m_1 = 1 + x + x^3$. Moreover, $\zeta^2$ is also a root of $m_1$ (the third root is
$\zeta^4$). Hence, as two consecutive powers of $\zeta$ are roots of $m_1(x)$, Theorem 4 guarantees


80 Discovered by A. Hocquenghem in 1959 and independently by R. C. Bose and D. V.
Ray-Chaudhuri in 1960, and hence the name BCH.




that the BCH code $C = (1 + t + t^3)$ in $B_7$ has a minimum distance of at least 3
and has a parity check matrix


$$H = \begin{bmatrix}
1 & 1 \\
\zeta & \zeta^2 \\
\vdots & \vdots \\
\zeta^6 & \zeta^{12}
\end{bmatrix}.$$


This code is the Hamming (7, 4)-code, and the minimum distance is in fact 3 because
$1 + t + t^3$ has weight 3. We described it in Example 6 with a different parity check
matrix.

The minimal polynomial of $\zeta^0 = 1$ is $1 + x$, so the polynomial

$$g = (1 + x)(1 + x + x^3) = 1 + x^2 + x^3 + x^4$$


has three consecutive powers, $\zeta^0$, $\zeta^1$, and $\zeta^2$, as roots. Hence, $(g(t))$ has a minimum
distance of at least 4 by Theorem 5 (in fact it is 4). Thus, $(g(t))$ can detect three errors
and correct one error by Theorem 4 §2.11.
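
The table of powers of $\zeta$ in $GF(8)$ and the root claims in Example 8 can be reproduced with a few lines of bit arithmetic. In this sketch a field element $a_0 + a_1\zeta + a_2\zeta^2$ is stored as the integer with bits $a_0, a_1, a_2$, and polynomials over $\mathbb{Z}_2$ are bitmasks in the same way; the encoding and the names `field_powers` and `evaluate` are assumptions for the illustration. The code also assumes the modulus is irreducible, so that $\zeta^{2^r - 1} = 1$.

```python
def field_powers(modulus, degree):
    """Powers 1, z, z^2, ..., z^(2^degree - 2) of a root z of `modulus`.

    `modulus` is a bitmask of a polynomial of the given degree; each power
    is reduced to a bitmask of degree less than `degree`.
    """
    powers, elem = [], 1
    for _ in range(2 ** degree - 1):
        powers.append(elem)
        elem <<= 1              # multiply by z
        if elem >> degree:      # reduce using modulus(z) = 0
            elem ^= modulus
    return powers


def evaluate(f, i, powers):
    """Value of the polynomial f (bitmask over Z_2) at z^i."""
    n = len(powers)
    value = 0
    for e in range(f.bit_length()):
        if (f >> e) & 1:
            value ^= powers[(i * e) % n]  # (z^i)^e = z^(i*e), exponents mod n
    return value


powers = field_powers(0b1011, 3)          # zeta^3 = 1 + zeta
print([format(p, "03b")[::-1] for p in powers])
m1 = 0b1011                               # 1 + x + x^3
g = 0b11101                               # 1 + x^2 + x^3 + x^4
print([i for i in range(7) if evaluate(m1, i, powers) == 0])
print([i for i in range(7) if evaluate(g, i, powers) == 0])
```

The first line prints the coordinate strings $a_0 a_1 a_2$ of $1, \zeta, \dots, \zeta^6$. The root exponents found are 1, 2, 4 for $m_1$ and 0, 1, 2, 4 for $g$, so $g$ does have $\zeta^0, \zeta^1, \zeta^2$ as three consecutive roots.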


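Example 9. The polynomial $1 + x + x^4$ is irreducible over $\mathbb{Z}_2$ (Exercise 11). If $\zeta$
is a root, we get the Galois field


$$F = GF(16) = \{a_0 + a_1\zeta + a_2\zeta^2 + a_3\zeta^3 \mid a_i \in \mathbb{Z}_2,\ \zeta^4 = 1 + \zeta\}.$$


The powers of $\zeta$ are


$$\begin{array}{lll}
\zeta^1 = \zeta & \zeta^6 = \zeta^2 + \zeta^3 & \zeta^{11} = \zeta + \zeta^2 + \zeta^3 \\
\zeta^2 = \zeta^2 & \zeta^7 = 1 + \zeta + \zeta^3 & \zeta^{12} = 1 + \zeta + \zeta^2 + \zeta^3 \\
\zeta^3 = \zeta^3 & \zeta^8 = 1 + \zeta^2 & \zeta^{13} = 1 + \zeta^2 + \zeta^3 \\
\zeta^4 = 1 + \zeta & \zeta^9 = \zeta + \zeta^3 & \zeta^{14} = 1 + \zeta^3 \\
\zeta^5 = \zeta + \zeta^2 & \zeta^{10} = 1 + \zeta + \zeta^2 & \zeta^{15} = 1
\end{array}$$

Hence, $\zeta$ is a primitive 15th root of unity over $\mathbb{Z}_2$. The minimal polynomial of $\zeta$ is
$m_1 = 1 + x + x^4$, and both $\zeta^2$ and $\zeta^4$ are roots, as is easily verified. The minimal
polynomial of $\zeta^3$ is $m_3 = 1 + x + x^2 + x^3 + x^4$ (Exercise 10), so

$$g = m_1 m_3 = 1 + x^4 + x^6 + x^7 + x^8$$

has $\zeta, \zeta^2, \zeta^3$, and $\zeta^4$ as roots. Both $m_1$ and $m_3$ divide $x^{15} - 1$, so the BCH code

$$C = (1 + t^4 + t^6 + t^7 + t^8)$$


is a (15, 7)-code, with minimum distance at least 5 by Theorem 5. (The distance
is 5 because $1 + t^4 + t^6 + t^7 + t^8$ has weight 5.) Hence, $C$ can detect four errors and
correct two errors by Theorem 4 §2.11.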
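
For codes this small, the minimum distances quoted in Examples 8 and 9 can be confirmed by listing every nonzero code word. The sketch below relies on Theorem 2: every code word is $r(t)g(t)$ with $\deg r < k$, so the product has degree less than $n$ and no reduction modulo $1 - x^n$ is needed. Polynomials are integer bitmasks (bit $i$ is the coefficient of $x^i$), and the function names are hypothetical.

```python
def gf2_mul(a, b):
    """Product in Z_2[x] of two polynomials stored as bitmasks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def min_distance(g, n):
    """Minimum weight of a nonzero word in the cyclic code (g(t)) of length n.

    Assumes g divides 1 - x^n, so the code has dimension k = n - deg g.
    """
    k = n - (g.bit_length() - 1)
    return min(bin(gf2_mul(r, g)).count("1") for r in range(1, 1 << k))


m1 = 0b10011   # 1 + x + x^4
m3 = 0b11111   # 1 + x + x^2 + x^3 + x^4
g = gf2_mul(m1, m3)
print(bin(g))                    # coefficients of 1 + x^4 + x^6 + x^7 + x^8
print(min_distance(g, 15))       # the (15, 7) BCH code of Example 9
print(min_distance(0b1011, 7))   # the Hamming (7, 4)-code of Example 8
print(min_distance(0b11101, 7))  # the code (1 + t^2 + t^3 + t^4) of Example 8
```

The search examines $2^k - 1$ words, so it is only practical for small $k$; the bound in Theorem 5 is what makes larger codes tractable.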


Example 10. Let $F = GF(2^k)$ be the Galois field of order $2^k$. Then the multiplicative
group $F^*$ of nonzero elements of $F$ is cyclic by Theorem 7 §6.4, say, $F^* = \langle\zeta\rangle$.
Writing $n = 2^k - 1$, this means that $\zeta$ is a primitive $n$th root of unity over $\mathbb{Z}_2$. Hence,
BCH codes with $n = 2^k - 1$ are called primitive.


As these examples indicate, Theorem 4 is useful for constructing codes, and
these BCH codes are of great practical importance. For example, the European
and transatlantic communication system uses a BCH (255, 231)-code that detects
six errors and has a failure rate of 1 in 16 million. As another example, a BCH
(128, 112)-code that detects three errors and corrects two errors is used to
communicate with the INTELSAT-V satellite.

In addition to Theorem 4, BCH codes are useful because they admit an efficient
error-correcting algorithm. A complete discussion of this algorithm is beyond the
scope of this book, but we conclude this section with a sketch of the procedure.
Given a BCH code $C$, suppose that a code word $c$ in $B^n$ is transmitted and $w$ is
received with errors in bits $a_1, a_2, \ldots, a_r$. Then $w = c + e$, where $e$ is the error word
with polynomial form $e = x^{a_1} + x^{a_2} + \cdots + x^{a_r}$. The decoder must determine the
integers $a_j$ and then decode $w$ by changing bit $a_j$ for each $j$. Now $w$ is known and
so are the quantities $s_i = w(\zeta^i)$, $i = b, b+1, \ldots, b+d-2$, where $\zeta$ is a primitive
$n$th root of unity over $\mathbb{Z}_2$. If $H$ is as in Theorem 5, then $wH = [s_b \ s_{b+1} \ \cdots \ s_{b+d-2}]$,
so Theorem 5 gives $w \in C$ if and only if $s_i = 0$ for all $i$. In particular, $cH = 0$, so
$e(\zeta^i) = s_i$ for all $i$ because $w = c + e$. Thus,

$$\zeta^{i a_1} + \zeta^{i a_2} + \cdots + \zeta^{i a_r} = e(\zeta^i) = s_i, \qquad i = b, b+1, \ldots, b+d-2.$$


The idea of the decoding algorithm is to determine the quantities $\zeta^{a_j}$ from these
equations in terms of the (known) $s_i$. They are determined as the roots of the
error-locator polynomial:

$$s = (x + \zeta^{a_1})(x + \zeta^{a_2}) \cdots (x + \zeta^{a_r}).$$

Because the roots of $s$ are powers of $\zeta$, we can determine these roots by substituting
the powers in $s$ one by one. So the real problem is finding the coefficients of $s$ in
terms of the $s_i$. The main difficulty is that the number $r$ of errors is not known
in advance, although algorithms for finding it are known.⁸¹ We content ourselves with an
example where $r = 2$.


Example 11. The (15, 7)-code $C = (1 + t^4 + t^6 + t^7 + t^8)$ in Example 9 can correct
two errors. Assume that two errors do occur in bits $a$ and $b$ so that $e = x^a + x^b$.
Then the error-locator polynomial is

$$s = (x + \zeta^a)(x + \zeta^b) = x^2 + (\zeta^a + \zeta^b)x + \zeta^{a+b}.$$

Now $\zeta^a + \zeta^b = e(\zeta) = s_1$ is known; to find $\zeta^{a+b}$ in terms of the $s_i$, we compute

$$s_1^3 = (\zeta^a + \zeta^b)^3 = \zeta^{3a} + \zeta^{3b} + \zeta^{a+b}(\zeta^a + \zeta^b) = s_3 + \zeta^{a+b}s_1.$$

Hence, $\zeta^{a+b} = s_1^2 + \dfrac{s_3}{s_1}$, so the error-locator polynomial is

$$s = x^2 + s_1 x + \left(s_1^2 + \frac{s_3}{s_1}\right).$$

To illustrate how this procedure works, suppose that $c = 1 + x^4 + x^6 + x^7 + x^8$ is
transmitted and that $w = 1 + x + x^4 + x^6 + x^7$ is received (with errors in bits 1 and
8). Using the formulas in Example 9, we have

$$s_1 = w(\zeta) = 1 + \zeta + \zeta^4 + \zeta^6 + \zeta^7 = 1 + \zeta + \zeta^2,$$
$$s_3 = w(\zeta^3) = 1 + \zeta^3 + \zeta^{12} + \zeta^{18} + \zeta^{21} = \zeta.$$


⁸¹ See MacWilliams, F.J. and Sloane, N.J.A., The Theory of Error-Correcting Codes, New York:
North-Holland, 1977.




Hence, because $s_1^{-1} = \zeta + \zeta^2$ in $GF(16)$, we get $s = x^2 + (1 + \zeta + \zeta^2)x + (\zeta + \zeta^3)$.
Then $s(\zeta) = 0 = s(\zeta^8)$, so the roots of $s(x)$ are $\zeta^1$ and $\zeta^8$, locating errors in bits 1
and 8.
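
The two-error procedure of Example 11 can be sketched in a few lines of Python. This is an illustrative sketch and not the general BCH decoder: it assumes exactly two errors occurred (so $s_1 \neq 0$), represents a word by the set of exponents with coefficient 1, stores elements of $GF(16)$ as 4-bit integers (bit $i$ is the coefficient of $\zeta^i$), and uses the hypothetical helper names `gf16_mul`, `gf16_pow`, `syndrome`, and `locate_two_errors`.

```python
# Assumption: GF(16) elements are 4-bit masks; zeta^4 = 1 + zeta as in Example 9.
MOD16 = 0b10011
ZETA = 0b0010


def gf16_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(16) by shift-and-add with reduction."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= MOD16
    return result


def gf16_pow(a: int, n: int) -> int:
    """Compute a^n in GF(16) for an integer n >= 0."""
    result = 1
    for _ in range(n):
        result = gf16_mul(result, a)
    return result


def syndrome(exponents: set, i: int) -> int:
    """Return s_i = w(zeta^i), where w is the sum of x^j over j in exponents."""
    point = gf16_pow(ZETA, i)
    total = 0
    for j in exponents:
        total ^= gf16_pow(point, j)
    return total


def locate_two_errors(received: set) -> list:
    """Find the roots zeta^k of s = x^2 + s1*x + (s1^2 + s3/s1); return the k's."""
    s1 = syndrome(received, 1)
    s3 = syndrome(received, 3)
    s1_inv = gf16_pow(s1, 14)            # s1^15 = 1 for s1 != 0, so s1^14 = 1/s1
    constant = gf16_mul(s1, s1) ^ gf16_mul(s3, s1_inv)
    positions = []
    for k in range(15):                   # substitute the powers of zeta one by one
        x = gf16_pow(ZETA, k)
        if (gf16_mul(x, x) ^ gf16_mul(s1, x) ^ constant) == 0:
            positions.append(k)
    return positions


if __name__ == "__main__":
    w = {0, 1, 4, 6, 7}                   # w = 1 + x + x^4 + x^6 + x^7
    errors = locate_two_errors(w)
    print(errors, sorted(w ^ set(errors)))  # flip the located bits
```

Flipping the located bits is a symmetric difference of exponent sets, which is why the corrected word is computed as `w ^ set(errors)`.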


Exercises 6.7 


1. (a) Show that $(f_1 + f_2 + \cdots + f_n)^2 = f_1^2 + f_2^2 + \cdots + f_n^2$ for all $f_i$ in $\mathbb{Z}_2[x]$.
(b) Show that $f(x)^2 = f(x^2)$ for all $f$ in $\mathbb{Z}_2[x]$.
(c) If $f \in \mathbb{Z}_2[x]$, show that $f' = 0$ if and only if $f = g(x)^2$ for some $g$ in $\mathbb{Z}_2[x]$. Here,
$f'$ is the derivative of $f$.

2. Confirm that $(1 + t)$ is the code of all polynomials in $B_n$ of even parity.

3. Show that the ideals of $B_n$ form a chain if and only if $n = 2^k$ for some $k \geq 1$.

4. Draw the lattice diagram of all codes in Bg.

5. (a) Find all generator polynomials for the code $C = (1 + t)$ in $B_7$.
(b) Repeat (a) in $B_5$. [Hint: Exercise 10 and Theorems 1 and 2.]

6. Let $C = (1 + t)$ and $D = (1 + t + \cdots + t^{n-1})$ in $B_n$.
(a) If $n$ is odd, show that $C \cap D = \{0\}$ and $B_n \cong C \times D$ as rings.
(b) If $n$ is even, show that $D \subseteq C$.

7. How many cyclic codes are there of length
(a) 7 (b) 6 (c) 8
(d) 12 (e) 10

8. Show that every cyclic code $C$ in $B_n$ has the form $C = \operatorname{ann} f(t)$ for some divisor
$f$ of $1 - x^n$ in $\mathbb{Z}_2[x]$.

9. Let $E$ be a finite field containing $\mathbb{Z}_2$ and assume that $u$ is a primitive element for
$E$ (that is, $E^* = \langle u \rangle$; see Theorem 7 §6.4). If $m_i$ is the minimal polynomial of $u^i$,
show that $m_i$ divides $x^n - 1$, where $n = |E| - 1$.

10. (a) Show that $1 + x + x^2 + x^3 + x^4$ is irreducible in $\mathbb{Z}_2[x]$.
(b) Show that $1 + x + x^2 + x^3 + x^4 + x^5 + x^6$ is not irreducible in $\mathbb{Z}_2[x]$. Compare
with Example 13 §4.2.

11. Show that $1 + x + x^4$ is irreducible in $\mathbb{Z}_2[x]$.

12. Factor 1 — x° into irreducibles in Zg[z]. 

13. Show that

$$1 - x^{15} = (1 + x)(1 + x + x^2)(1 + x + x^4)(1 + x^3 + x^4)(1 + x + x^2 + x^3 + x^4)$$

is the factorization of $1 - x^{15}$ into irreducibles in $\mathbb{Z}_2[x]$.

14. Find the generating polynomial for a BCH (31, 16)-code that corrects three errors.
[Hint: Show that $x^5 + x^2 + 1$ is irreducible in $\mathbb{Z}_2[x]$ and use a root $\zeta$ to construct
$GF(32)$. Show that $x^5 + x^2 + 1$, $x^5 + x^4 + x^3 + x^2 + 1$, and $x^5 + x^4 + x^2 + x + 1$ are
the minimal polynomials of $\zeta$, $\zeta^3$, and $\zeta^5$, respectively.]

15. Suppose that a cyclic code $C$ in $B_n$ contains a word of odd parity. Show that
$1 + t + t^2 + \cdots + t^{n-1}$ is in $C$.

16. If $n$ is odd, show that $x^n - 1$ is square free when factored into irreducibles in $\mathbb{Z}_2[x]$.
[Hint: See the proof of Theorem 3 §6.4.]

17. Assume that $C = (g(t))$ is a cyclic code in $B_n$.
(a) If $n$ is odd, show that $C = (e(t))$, where $e(t)^2 = e(t)$ in $B_n$. [Hint: By Exercise
16, write $x^n - 1 = gh$, where $g$ and $h$ are relatively prime in $\mathbb{Z}_2[x]$. By Theorem 10
§4.2, write $1 = qg + ph$ in $\mathbb{Z}_2[x]$ and take $e = qg$.]
(b) Show that $e(t)$ in (a) is uniquely determined by $C$, called the idempotent
generator of $C$.


(c) Find the idempotent generator for $C = (1 + t + t^3)$ in $B_7$.
(d) Find the idempotent generator for $C = (1 + t)$ in $B_n$ ($n$ odd).
(e) If $n = 2^k$, show that $B_n$ contains no idempotents except 0 and 1. [Hint: Exercise
1(b).] Find an idempotent in Bg other than 0 or 1.

18. Let $C = (g(t))$ in $B_n$, where $g$ divides $x^n - 1$ in $\mathbb{Z}_2[x]$ and $\deg g = m$. Let
$u_1, u_2, \ldots, u_m$ be the roots of $g$ in some splitting field $E \supseteq \mathbb{Z}_2$, and let

$$H = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ u_1 & u_2 & \cdots & u_m \\ \vdots & \vdots & & \vdots \\ u_1^{n-1} & u_2^{n-1} & \cdots & u_m^{n-1} \end{bmatrix}.$$

(a) If the roots $u_i$ are distinct, show that the matrix $H$ is a parity check matrix
for $C$ (with entries in $E$).
(b) If $n$ is odd, show that the roots $u_i$ are necessarily distinct.

19. (Requires elementary linear algebra.) Let $G$ be a generator matrix for an $(n, k)$-code
$C$ in $B^n$. Carry $G$ to row echelon form $R$ by elementary row operations. Show that
$R$ has the block form $R = [I_k \ A]$, where $A$ is a $k \times (n - k)$ matrix, and that $R$ is
also a generator matrix for $C$ (and so is a standard generator matrix for $C$; see the
discussion preceding Theorem 6 §2.11).

20. Let $g = g_0 + g_1x + \cdots + g_{n-1}x^{n-1}$ and $h = h_0 + h_1x + \cdots + h_{n-1}x^{n-1}$ in $\mathbb{Z}_2[x]$.
Show that $g(t)h(t) = 0$ in $B_n$ if and only if $\bar{g} = g_0g_1 \cdots g_{n-1}$ is orthogonal to
$\bar{h} = h_{n-1}h_{n-2} \cdots h_0$ and to every cyclic shift of $\bar{h}$.


Chapter 7 


Modules over Principal Ideal 
Domains 


Algebra is generous, she often gives more than is asked of her. 
—Jean LeRond d’Alembert 


One of the goals of abstract algebra (and of other parts of mathematics for that 
matter) is to take a class of algebraic structures and show that each object in the 
class can be systematically constructed from simple and well-understood objects 
in the class. In this short chapter, we achieve this goal for the class of all finitely 
generated abelian groups: Each such group is isomorphic to the direct product 
of a finite number of cyclic groups. In fact, with little extra effort, we actually 
prove a more general version of this result which has far-reaching implications. 
This is achieved by introducing the concept of a module which, apart from its 
intrinsic interest, has become an indispensable tool in several areas of algebra and 
its applications. In the present case, the abelian groups turn out to be the modules 
over the ring Z of integers; our generalization is to look at modules over an arbitrary 
principal ideal domain. As a by-product, we obtain the classical description of the 
finitely generated abelian groups as direct products of cyclic groups. 


7.1. MODULES 


Much of what we say about modules is motivated by abelian groups. It is customary
to write abelian groups additively, and we shall do so throughout this chapter.
Hence, the unity is called zero and denoted 0, the inverse of an element $x$ is called
the negative of $x$, denoted $-x$, and an exponent $x^n$ becomes $nx$ for any integer $n$.


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 




Thus, the equation $x^0 = 1$ in multiplicative notation becomes $0x = 0$ in an additive
group. The laws of exponents translate as follows:

    Multiplicative Notation                    Additive Notation
    $g^1 = g$                                  $1x = x$
    $g^{n+m} = g^n g^m$                        $(n + m)x = nx + mx$
    $(g^m)^n = g^{nm}$                         $n(mx) = (nm)x$
    $(gh)^n = g^n h^n$ (when $gh = hg$)        $n(x + y) = nx + ny$

In their additive form these rules are exactly the axioms for scalar multiplication in
a vector space (see Section 6.1), except that here $m$ and $n$ are restricted to come
from the ring $\mathbb{Z}$. The definition of a module unifies these two cases.

Let $M$ denote an abelian group (written additively). If $R$ is any ring, we say
that $M$ is a left R-module if, for any $r \in R$ and $x \in M$, an element $rx \in M$ is
defined such that the following conditions hold for all $x, y \in M$ and all $r, s \in R$:

M1 $r(x + y) = rx + ry$
M2 $(r + s)x = rx + sx$
M3 $r(sx) = (rs)x$
M4 $1x = x$

Using M1 and M2, we have $0x = 0$ and $r0 = 0$ for all $x \in M$ and $r \in R$
(Exercise 1).

The multiplication $rx$ where $r \in R$ and $x \in M$ is called the (left) action of $R$
on the abelian group $M$. We will write ${}_RM$ to indicate that $M$ is a left R-module.
Similarly, $M$ is called a right R-module (denoted $M_R$) if $xr \in M$ for all $x \in M$ and
$r \in R$, and the left-right analogues of M1-M4 hold.


Example 1. The Z-modules are just the (additive) abelian groups. 
If R = F is a field, the F-modules are the vector spaces. 


Example 2. A ring $R$ becomes an R-module ${}_RR$, where the action is the ring
multiplication.


Example 3. Let $M_1, M_2, \ldots, M_n$ be R-modules, $n \geq 1$. The set of n-tuples

$$M_1 \oplus M_2 \oplus \cdots \oplus M_n = \{(x_1, x_2, \ldots, x_n) \mid x_i \in M_i \text{ for each } i\}$$

is an R-module using componentwise addition and R-action:

$$(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$$
$$r(x_1, x_2, \ldots, x_n) = (rx_1, rx_2, \ldots, rx_n)$$

for all $r \in R$. This module is called the (external) direct sum of the $M_i$. If $M$ is
any module, we will write the direct sum of $n$ copies of $M$ as

$$M^n = M \oplus M \oplus \cdots \oplus M.$$


Many of the notions we have been exploring for groups and vector spaces have 
natural analogues for R-modules. We briefly discuss these one by one. 


1. Homomorphisms. Every group homomorphism $\alpha$ preserves exponents in the
sense that $\alpha(g^n) = \alpha(g)^n$ for all integers $n$ and all $g$ in the domain of $\alpha$. In additive
language this is $\alpha(nx) = n\alpha(x)$. Accordingly, if ${}_RM$ and ${}_RN$ are R-modules, we say
that a map $\alpha : M \to N$ is an R-homomorphism (R-morphism for short) if:

H1 $\alpha(x + y) = \alpha(x) + \alpha(y)$, for all $x, y \in M$
H2 $\alpha(rx) = r\alpha(x)$, for all $x \in M$ and $r \in R$

We also say that $\alpha$ is R-linear in this case. Note that every R-morphism is a
homomorphism of additive groups by H1. If $R = \mathbb{Z}$ is the ring of integers, then H2
follows from H1, so the $\mathbb{Z}$-morphisms are precisely the group homomorphisms. For
a field $R = F$, the F-morphisms are just the linear transformations.

If ${}_RM$ and ${}_RN$ are modules, the trivial morphism $\alpha : {}_RM \to {}_RN$ (where $\alpha(x) = 0$
for all $x \in M$) is always R-linear, as is the identity morphism $1_M : M \to M$ given
by $1_M(x) = x$ for each $x \in M$. Given modules ${}_RM$, ${}_RN$, and ${}_RK$, and R-morphisms
$M \xrightarrow{\beta} N \xrightarrow{\alpha} K$, the composite $\alpha\beta : M \to K$ is also R-linear as is easily verified.

If an R-morphism is one-to-one and onto, it is called an R-isomorphism. Two
modules ${}_RM$ and ${}_RN$ are called isomorphic (written $M \cong N$) if there exists an
R-isomorphism $\sigma : M \to N$. In this case $\sigma^{-1} : N \to M$ is also R-linear (verify)
and so is also an R-isomorphism. The results in Theorem 3 §2.5 and its Corollary
remain true for R-morphisms in general.


2. Submodules. If ${}_RM$ is an R-module, a subset $N \subseteq M$ will be called an
R-submodule if the following conditions are satisfied:

S1 $N$ is a subgroup of the (additive, abelian) group $M$
S2 $rx$ is in $N$ for all $r \in R$ and $x \in N$

Thus, the $\mathbb{Z}$-submodules of a group are just the subgroups and, if $F$ is a field, the
F-submodules of a vector space are the subspaces. If $M$ is a module, both $M$ and
$\{0\}$ are submodules of $M$; we call $\{0\}$ the zero submodule and write $\{0\} = 0$.
Note that a submodule $N$ of $M$ is an R-module in its own right, that is, M1-M4
hold for $N$ (verify). Moreover, submodules of $N$ are already submodules of $M$.


Example 4. The submodules of ${}_RR$ are called the left ideals of the ring $R$; they are
the additive subgroups $L$ of $R$ for which $rx \in L$ for all $r \in R$ and $x \in L$. Similarly
the right ideals of $R$ are the submodules of $R_R$. Thus, Lemma 2 §3.3 asserts that
the ideals of $R$ are precisely the additive subgroups of $R$ that are simultaneously
right and left ideals. Note that in a commutative ring the sets of ideals, left ideals,
and right ideals coincide.


Example 5. Let ${}_RM$ be an R-module. If $A$ is a left ideal of $R$ and $X$ is a nonempty
subset of $M$, define $AX$ to be the set of all finite sums of elements $ax$, $a \in A$, $x \in X$:

$$AX = \{a_1x_1 + a_2x_2 + \cdots + a_nx_n \mid n \geq 1,\ a_i \in A \text{ and } x_i \in X\}.$$

Then $AX$ is a submodule of ${}_RM$ as is easily verified. In particular, if $x \in M$
then $Rx = \{rx \mid r \in R\}$ is a submodule of $M$ called the principal submodule
generated by $x$. Note that if $R = \mathbb{Z}$ then $\mathbb{Z}x$ is the subgroup generated by $x$.


3. Kernels and Images. If $\alpha : {}_RM \to {}_RN$ is an R-linear mapping then $\alpha$ is a
group homomorphism and so has a kernel and an image as defined in Section 2.5:
(1) $\ker\alpha = \{x \in M \mid \alpha(x) = 0\}$.
(2) $\operatorname{im}\alpha = \{\alpha(x) \mid x \in M\} = \alpha(M)$.

Moreover, these are submodules of $M$ and $N$, respectively. In fact, S1 follows in
both cases because $\alpha$ is a group homomorphism (Theorem 1 §2.10). To prove
S2 for $\ker\alpha$, let $x \in \ker\alpha$ and observe that, if $r \in R$ then $rx \in \ker\alpha$ because
$\alpha(rx) = r\alpha(x) = r0 = 0$. Similarly, if $\alpha(x) \in \alpha(M)$ then $r\alpha(x) = \alpha(rx) \in \alpha(M)$ for
all $r \in R$, so S2 holds for $\alpha(M)$ too.

If $\alpha : M \to N$ is R-linear, it is clear that $\alpha$ is onto if and only if $\operatorname{im}\alpha = N$.
Moreover, since $\alpha$ is also $\mathbb{Z}$-linear, Theorem 3 §2.10 shows that $\alpha$ is one-to-one if
and only if $\ker\alpha = 0$. We will use these facts frequently below.


4. Factor Modules. Let $N$ be a submodule of ${}_RM$. Then $N$ is a subgroup of
the (abelian) additive group $M$, so $N$ is normal and the factor group $M/N$ exists.
Writing $x + N$ for the additive coset generated by $x$, the factor group takes the form
$M/N = \{x + N \mid x \in M\}$. This becomes an R-module via the action

$$r(x + N) = rx + N, \quad \text{for all } r \in R \text{ and } x \in M.$$

To see that this is well defined, let $x + N = y + N$ in $M/N$; we must show that
$rx + N = ry + N$ for any $r \in R$. Since $x + N = y + N$, we have $x - y \in N$ by
Theorem 1 §2.6. Hence $rx - ry = r(x - y) \in N$ because $N$ is a submodule, and so
$rx + N = ry + N$ (again by Theorem 1 §2.6). This is what we wanted.

With this, it is a routine matter to verify M1-M4 for the factor group $M/N$.
Thus $M/N$ is an R-module, called the factor module. Moreover, the coset map

$$\varphi : M \to M/N \text{ given by } \varphi(x) = x + N \text{ for all } x \in M$$

is an onto R-morphism with $\ker\varphi = N$. As an illustration, the map $k \mapsto \bar{k}$ from
$\mathbb{Z} \to \mathbb{Z}_n$ is a $\mathbb{Z}$-morphism so $\overline{ak} = a\bar{k}$ for all $a, k \in \mathbb{Z}$.


If $\alpha : {}_RM \to {}_RN$ is R-linear then Theorem 4 §2.10 shows that there exists
a $\mathbb{Z}$-isomorphism $\sigma : M/\ker\alpha \to \alpha(M)$ given by $\sigma(x + \ker\alpha) = \alpha(x)$ for all
$x$ in $M$. This map $\sigma$ actually satisfies H2 (verify), so $\sigma$ is R-linear and hence is an
R-isomorphism. This proves


Theorem 1. Module Isomorphism Theorem. If $\alpha : {}_RM \to {}_RN$ is R-linear
then $M/\ker\alpha \cong \alpha(M)$.


As one illustration, let ${}_RM$ be a module, let $x \in M$, and consider the map
$\theta : R \to Rx$ defined by $\theta(r) = rx$ for all $r \in R$. This is R-linear and onto, and
$\ker\theta = \{r \in R \mid rx = 0\}$. This is a left ideal of $R$ called the annihilator of $x$, and
denoted $\operatorname{ann} x = \{r \in R \mid rx = 0\}$. Hence, the isomorphism theorem gives


Corollary 1. If $M = {}_RM$ is a module and $x \in M$, then $Rx \cong R/\operatorname{ann} x$.
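
For a concrete instance of Corollary 1, take $R = \mathbb{Z}$ and $M = \mathbb{Z}_n$. The Python sketch below is illustrative only; it relies on the standard fact that the annihilator of a residue class $\bar{x}$ in $\mathbb{Z}_n$ is generated by $n/\gcd(x, n)$, and the function names `annihilator_generator` and `principal_submodule` are hypothetical.

```python
from math import gcd


def annihilator_generator(x: int, n: int) -> int:
    """Return the least r > 0 with r*x = 0 in Z_n, so that ann(x) = rZ."""
    return n // gcd(x, n)


def principal_submodule(x: int, n: int) -> list:
    """Return the principal submodule Zx of Z_n as a sorted list of residues."""
    return sorted({(r * x) % n for r in range(n)})


if __name__ == "__main__":
    n, x = 12, 4
    r = annihilator_generator(x, n)
    # Corollary 1 predicts |Zx| = |Z / rZ| = r.
    print(r, principal_submodule(x, n))
```

Here the size of $\mathbb{Z}\bar{x}$ equals the index of $\operatorname{ann}\bar{x}$ in $\mathbb{Z}$, as the isomorphism $Rx \cong R/\operatorname{ann} x$ requires.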


An element $x$ in a module ${}_RM$ is called torsion-free if $\operatorname{ann} x = 0$, that is, if
$rx = 0$, $r \in R$, implies that $r = 0$. Hence $Rx \cong {}_RR$ in this case. Clearly 0 is never
torsion free, while in a vector space every nonzero element is torsion free. In an
abelian group the torsion-free elements are precisely the elements of infinite order.

Accordingly, we say that $x$ is a torsion element if $\operatorname{ann} x \neq 0$. Hence, in an
abelian group the torsion elements are just the elements of finite order. We say
that ${}_RM$ is a torsion module (a torsion-free module) if every nonzero element
is torsion (torsion free). We will have more to say about these modules below.




If $K$ and $N$ are submodules of a module $M$, then $K \cap N$ is also a submodule as
is $K + N = \{k + n \mid k \in K,\ n \in N\}$. The proof of the following consequences of
Theorem 1 is Exercise 7.


Corollary 2. Let $K$ and $N$ be submodules of a module $M$. Then:

(1) Second Isomorphism Theorem. $(K + N)/K \cong N/(K \cap N)$.
(2) Third Isomorphism Theorem. If $K \subseteq N$ then $M/N \cong (M/K)/(N/K)$.


One very useful method of describing a module M is to show that it is an external 
direct sum of well-understood modules. There is a way to do this “internally”, 
working only with submodules of M, that will be used extensively below. 


Direct Sums 


If $H_1, \ldots, H_m$ are submodules of a module ${}_RM$ their sum $H_1 + \cdots + H_m$ is defined
to be the set of all finite sums of elements of the $H_i$, more formally

$$H_1 + \cdots + H_m = \{x_1 + \cdots + x_m \mid x_i \in H_i \text{ for each } i\}.$$

This is a submodule of $M$. In fact it is the smallest submodule containing every $H_i$
in the sense that any submodule containing all the $H_i$ must contain $H_1 + \cdots + H_m$.
For example, if $a$ and $b$ are integers, the set $X = \{ra + sb \mid r, s \in \mathbb{Z}\}$ of all linear
combinations of $a$ and $b$ played a prominent role in the study of $\mathbb{Z}$ in Section 1.2.
Now we see that $X = \mathbb{Z}a + \mathbb{Z}b$ is the sum of the ideals of $\mathbb{Z}$ generated by $a$ and $b$.
Before proceeding, we note an important property of sums of submodules.


Theorem 2. Modular Law. Let $K$, $N$, and $H$ be submodules of a module $M$. If
$K \subseteq N$ then $N \cap (K + H) = K + (N \cap H)$.


Proof. Clearly $K + (N \cap H) \subseteq N \cap (K + H)$. Conversely, let $n \in N \cap (K + H)$, say
$n = k + h$ with the obvious notation. Then $h = n - k \in N \cap H$ because $K \subseteq N$, so
$n = k + h \in K + (N \cap H)$. This proves that $N \cap (K + H) \subseteq K + (N \cap H)$. ∎


Let $H_1, \ldots, H_m$ be submodules of some module. Clearly every element of the
sum $H_1 + \cdots + H_m$ is a sum of elements of the $H_i$, but in general this can happen in
more than one way. We say that $H_1 + \cdots + H_m$ is a direct sum if this representation
is unique for each $x \in H_1 + \cdots + H_m$; that is, given two representations,

$$x_1 + \cdots + x_m = x = y_1 + \cdots + y_m$$

where $x_i \in H_i$ and $y_i \in H_i$ for each $i$, then necessarily $x_i = y_i$ for each $i$.

Theorem 3. The following conditions are equivalent for a set $H_1, \ldots, H_m$ of
submodules of a module $M$:

(1) The sum $H_1 + \cdots + H_m$ is direct.
(2) If $x_1 + \cdots + x_m = 0$ where $x_i \in H_i$ for each $i$, then each $x_i = 0$.
(3) $\left(\sum_{i \neq k} H_i\right) \cap H_k = 0$ for each $k$.
(4) $(H_1 + \cdots + H_{k-1}) \cap H_k = 0$ for each $k \geq 2$.

In this case $H_1 + \cdots + H_m \cong H_1 \oplus \cdots \oplus H_m$, the external direct sum.


Proof. (1)⇒(2). We have $x_1 + \cdots + x_m = 0 = 0 + \cdots + 0$, so each $x_i = 0$ by (1).

(2)⇒(3). Let $x \in \left(\sum_{i \neq k} H_i\right) \cap H_k$, so that $x = \sum_{i \neq k} x_i$ with $x_i \in H_i$ for each
$i \neq k$. Then $x_1 + \cdots + x_{k-1} + (-x) + x_{k+1} + \cdots + x_m = 0 = 0 + \cdots + 0 + \cdots + 0$.
Hence $-x = 0$ by (2), so $x = 0$, proving (3).

(3)⇒(4). This is clear since $H_1 + \cdots + H_{k-1} \subseteq \sum_{i \neq k} H_i$ for each $k \geq 2$.

(4)⇒(1). Let $x_1 + \cdots + x_m = y_1 + \cdots + y_m$ be two representations of an element
in $H_1 + \cdots + H_m$, where $x_i, y_i \in H_i$ for each $i$. Then $(x_1 - y_1) + \cdots + (x_m - y_m) = 0$
and $x_i - y_i \in H_i$ for each $i$. Then $x_m - y_m \in \left(\sum_{i < m} H_i\right) \cap H_m = 0$ by (4), so $x_m = y_m$.
Continuing in this way, (4) implies that $x_i = y_i$ for each $i < m$.

Finally, define $\sigma : H_1 \oplus \cdots \oplus H_m \to H_1 + \cdots + H_m$ by
$\sigma(x_1, \ldots, x_m) = x_1 + \cdots + x_m$, where $x_i \in H_i$ for each $i$.
Then $\sigma$ is R-linear and onto, and it is one-to-one by (2). So $\sigma$ is an isomorphism. ∎
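Corollary 1. If $K, N \subseteq M$ then $K + N$ is direct if and only if $K \cap N = 0$.
Hence if $M = K \oplus N$ then $M/K \cong N$ and $M/N \cong K$ (Exercise 9).

Example 6. Let $m = pq$ in $\mathbb{Z}$, where $\gcd(p, q) = 1$. Show that $\mathbb{Z}_m = \mathbb{Z}\bar{p} \oplus \mathbb{Z}\bar{q}$.


Solution. If $k \in \mathbb{Z}$, let $\bar{k} \in \mathbb{Z}_m$ denote the residue class. Let $1 = xp + yq$, $x, y \in \mathbb{Z}$
(as $\gcd(p, q) = 1$). Then $\bar{k} = \overline{kxp} + \overline{kyq} = kx\bar{p} + ky\bar{q}$ for all $k \in \mathbb{Z}$, so $\mathbb{Z}_m = \mathbb{Z}\bar{p} + \mathbb{Z}\bar{q}$.

To see that $\mathbb{Z}\bar{p} \cap \mathbb{Z}\bar{q} = \{\bar{0}\}$, let $\bar{k} \in \mathbb{Z}\bar{p} \cap \mathbb{Z}\bar{q}$, say $\bar{k} = a\bar{p} = b\bar{q}$ where $a, b \in \mathbb{Z}$.
Then $\overline{ap} = \overline{bq}$ so $m \mid (ap - bq)$. Since $q \mid m$ it follows that $q \mid ap$. But then $q \mid a$ (again
because $\gcd(p, q) = 1$), say $a = zq$ with $z \in \mathbb{Z}$. Hence, $\bar{k} = a\bar{p} = z\overline{qp} = z\bar{m} = z\bar{0} = \bar{0}$.
This shows that $\mathbb{Z}\bar{p} \cap \mathbb{Z}\bar{q} = \{\bar{0}\}$, as required by Corollary 1 of Theorem 3. □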
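
Example 6 can be checked by brute force for small $p$ and $q$. The following Python sketch is an illustration, not a proof; it lists the residues in the two principal submodules of $\mathbb{Z}_m$ and tests the two conditions of Corollary 1 (the sum is everything and the intersection is zero). The function name `direct_sum_check` is hypothetical.

```python
def direct_sum_check(p: int, q: int) -> tuple:
    """For m = p*q, test whether Z_m = Zp + Zq and whether Zp ∩ Zq = {0}."""
    m = p * q
    zp = {(p * k) % m for k in range(m)}      # the submodule generated by p
    zq = {(q * k) % m for k in range(m)}      # the submodule generated by q
    total = {(h + k) % m for h in zp for k in zq}
    return total == set(range(m)), (zp & zq) == {0}


if __name__ == "__main__":
    print(direct_sum_check(3, 5))   # gcd(3, 5) = 1
    print(direct_sum_check(2, 4))   # gcd(2, 4) = 2, so the hypothesis fails
```

The second call shows that the hypothesis $\gcd(p, q) = 1$ cannot simply be dropped: with $p = 2$, $q = 4$ neither condition holds.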


Example 7. If $e^2 = e \in R$ is an idempotent, show that $R = Re \oplus R(1 - e)$.


Solution. We have $R = Re + R(1 - e)$ because $1 \in Re + R(1 - e)$ and $Re + R(1 - e)$
is a left ideal. Now suppose that $a \in Re \cap R(1 - e)$, say $a = re = s(1 - e)$, $r, s \in R$.
Then $ae = re^2 = [s(1 - e)]e = s(e - e^2) = 0$ because $e = e^2$. Since $ae = re^2 = re = a$,
it follows that $a = 0$, which proves that $Re \cap R(1 - e) = 0$. Hence, $R = Re \oplus R(1 - e)$
by Corollary 1 of Theorem 3, as required. □
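
As a small illustration of Example 7, consider the commutative ring $\mathbb{Z}_n$, in which left ideals are simply ideals. The Python sketch below is a hedged example, not part of the original text: it takes an idempotent $e$ of $\mathbb{Z}_n$ and lists $Re$ and $R(1 - e)$ so that the decomposition can be inspected. The names `principal_ideal` and `idempotent_decomposition` are hypothetical.

```python
def principal_ideal(generator: int, n: int) -> set:
    """Return the ideal of Z_n generated by the given residue."""
    return {(r * generator) % n for r in range(n)}


def idempotent_decomposition(e: int, n: int) -> tuple:
    """Return (Re, R(1 - e)) in R = Z_n for an idempotent e."""
    if (e * e) % n != e % n:
        raise ValueError("e is not an idempotent of Z_n")
    return principal_ideal(e, n), principal_ideal((1 - e) % n, n)


if __name__ == "__main__":
    re_part, rest = idempotent_decomposition(3, 6)   # 3*3 = 9 = 3 in Z_6
    sums = {(a + b) % 6 for a in re_part for b in rest}
    print(sorted(re_part), sorted(rest), sorted(sums))
```

For $e = 3$ in $\mathbb{Z}_6$ the two summands meet only in 0 and together produce every residue, which is the content of $R = Re \oplus R(1 - e)$.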


Note that the converse of Example 7 is also true; see Exercise 15.

Let $H_1, H_2, \ldots, H_m$ be submodules of a module $M$ such that $H_1 + H_2 + \cdots + H_m$
is a direct sum. In view of the isomorphism in the last sentence of Theorem 3, it is
customary to abuse the language somewhat and write

$$H_1 + H_2 + \cdots + H_m = H_1 \oplus H_2 \oplus \cdots \oplus H_m.$$

We refer to this as the internal direct sum of the $H_i$. The fact that we use the
same notation for internal and external direct sums causes little confusion: It is
nearly always clear from the context which one is intended.


Corollary 2. Let $M = H_1 \oplus H_2 \oplus \cdots \oplus H_m$ be a direct sum of modules. If
$M = K_1 + K_2 + \cdots + K_m$ and $K_i \subseteq H_i$ are submodules, then $K_i = H_i$ for each $i$.


Proof. Let $x_i \in H_i$ for each $i$ and write $x = x_1 + x_2 + \cdots + x_m$. By hypothesis, let
$x = k_1 + k_2 + \cdots + k_m$, where $k_i \in K_i \subseteq H_i$ for each $i$. Because $H_1 \oplus \cdots \oplus H_m$ is
direct, $k_i = x_i$ for each $i$ by Theorem 3, so $x_i \in K_i$. Hence, $H_i \subseteq K_i$ for each $i$. ∎


If $M = H_1 \oplus H_2 \oplus \cdots \oplus H_m$ is an internal direct sum of modules, it is impossible
to overemphasize the importance of the uniqueness of the representation of each
element $x \in M$ in the form $x = h_1 + h_2 + \cdots + h_m$ with $h_i \in H_i$ for each $i$. This
is evident in the proof of Corollary 2. As another illustration, if $K_i \subseteq H_i$ is a
submodule for each $i$, consider the map

$$\varphi : M \to (H_1/K_1) \oplus (H_2/K_2) \oplus \cdots \oplus (H_m/K_m)$$

given by $\varphi(h_1 + h_2 + \cdots + h_m) = (h_1 + K_1, h_2 + K_2, \ldots, h_m + K_m)$. Then $\varphi$ is well
defined because of the uniqueness. Hence $\varphi$ is an onto module homomorphism, as
is easily verified, and $\ker\varphi = K_1 \oplus K_2 \oplus \cdots \oplus K_m$ (this sum is direct because sums
from the $K_i$ are sums from the $H_i$). The isomorphism theorem now gives


Corollary 3. Let $M = H_1 \oplus H_2 \oplus \cdots \oplus H_m$, and let $K_i \subseteq H_i$ be a submodule for
each $i$. Then $K = K_1 \oplus K_2 \oplus \cdots \oplus K_m$ is a direct sum and
$M/K \cong (H_1/K_1) \oplus \cdots \oplus (H_m/K_m)$.


The following result will be used several times below. The proof is Exercise 12.


Corollary 4. Suppose $M = H_1 \oplus H_2 \oplus \cdots \oplus H_m$ and $N = K_1 \oplus K_2 \oplus \cdots \oplus K_m$. If
$H_i \cong K_i$ for each $i$, then $M \cong N$.


Free Modules 


Up to this point the concepts we have discussed about R-modules are analogues of
those for abelian groups (the $\mathbb{Z}$-modules). Free modules reflect the vector spaces.

Given elements $x_1, \ldots, x_k$ in a module ${}_RM$, a sum $r_1x_1 + \cdots + r_kx_k$, $r_i \in R$,
is called a linear combination of the $x_i$, with coefficients $r_i$. The sum of the
submodules $Rx_i$ is a submodule

$$Rx_1 + \cdots + Rx_k = \{r_1x_1 + \cdots + r_kx_k \mid r_i \in R\},$$

consisting of all such linear combinations. This is the smallest submodule of $M$
that contains every $x_i$, and so is called the submodule generated by the $x_i$. If
$M = Rx_1 + \cdots + Rx_k$ we say that $\{x_1, \ldots, x_k\}$ is a generating set for $M$. If $M$
has a generating set of $n$ elements we say that $M$ is n-generated; $M$ is said to be
finitely generated if it is n-generated for some $n$. Note that the finitely generated
vector spaces are just the finite dimensional ones.

As for vector spaces, a set $\{x_1, \ldots, x_k\}$ in ${}_RM$ is called independent if

$$r_1x_1 + \cdots + r_kx_k = 0,\ r_i \in R, \text{ implies that } r_1 = \cdots = r_k = 0,$$

that is, if the only linear combination that vanishes is the trivial one with every
coefficient zero. Pursuing the vector space analogy, a subset $\{x_1, \ldots, x_k\}$ is called
a basis of $M$ if it is independent and generates $M$. The following characterization
of these bases will be needed.

Theorem 4. The following conditions are equivalent for $\{x_1, \ldots, x_k\} \subseteq {}_RM$:
(1) $\{x_1, \ldots, x_k\}$ is a basis of $M$.
(2) $M = Rx_1 \oplus \cdots \oplus Rx_k$ and each $x_i$ is torsion-free.

Proof. (1)⇒(2). Assume (1). Then $M = Rx_1 + \cdots + Rx_k$ because the $x_i$ generate
$M$. If $r_1x_1 + \cdots + r_kx_k = 0$, $r_i \in R$, then each $r_i = 0$ by independence. So certainly
$r_ix_i = 0$ for each $i$, which shows that $Rx_1 + \cdots + Rx_k$ is a direct sum. But if $rx_i = 0$
for some $i$ then $0x_1 + \cdots + rx_i + \cdots + 0x_k = 0$, so $r = 0$, again by independence.
This shows that each $x_i$ is torsion free, and so proves (2).

(2)⇒(1). The $x_i$ generate $M$ because $M = Rx_1 \oplus \cdots \oplus Rx_k$. Suppose that
$r_1x_1 + \cdots + r_kx_k = 0$, $r_i \in R$. Then each $r_ix_i = 0$ because $Rx_1 \oplus \cdots \oplus Rx_k$ is direct,
so each $r_i = 0$ because the $x_i$ are torsion free. Hence, the $x_i$ are independent. ∎


A module that has a finite basis is called a free module.⁸² Every finite
dimensional vector space is free (it has a finite basis by Theorem 6 §6.1); however,
in general finitely generated modules need not contain a basis (for example, the
$\mathbb{Z}$-module $\mathbb{Z}_n$ has no basis because it has no torsion-free elements). In fact, if $R$
is a domain every free module is torsion free (Exercise 25). On the other hand,
for a PID we will show that the converse holds for finitely generated modules
(Corollary to Theorem 1 §7.2). The "finitely generated" condition is essential: The
set $\mathbb{Q}$ of rational numbers is a torsion-free $\mathbb{Z}$-module that is not free (Exercise 25).


Example 8. $R^n$ is free with basis $\{(1, 0, \ldots, 0), (0, 1, \ldots, 0), \ldots, (0, 0, \ldots, 1)\}$, called
the standard basis of $R^n$; see Example 11 §6.1.


As for vector spaces, R-morphisms ${}_RW \to {}_RM$ are easy to define if ${}_RW$ is free.


Theorem 5. Let ${}_RW$ be free with a basis $\{w_1, \ldots, w_n\}$. Given any module ${}_RM$
and arbitrary elements $x_1, \ldots, x_n$ in $M$, define

$$\theta : W \to M \text{ by } \theta\left(\sum_i r_iw_i\right) = \sum_i r_ix_i,$$

where the $r_i$ are in $R$. Then $\theta$ is an R-linear map such that $\theta(w_i) = x_i$ for each $i$.


Proof. The map $\theta$ is well defined because each element $w \in W$ has the form
$w = \sum_i r_iw_i$ (the $w_i$ generate $W$), and this representation is unique (the $w_i$ are
independent). The rest is a routine verification. ∎
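
Theorem 5 says that a linear map out of a free module is determined by an arbitrary choice of images for the basis elements. The Python sketch below illustrates this in one special case chosen for the example: $R = \mathbb{Z}$, $W = \mathbb{Z}^n$ with its standard basis, and $M = \mathbb{Z}_m$. The function name `linear_extension` is hypothetical.

```python
def linear_extension(images: list, modulus: int):
    """Return theta: Z^n -> Z_modulus with theta(w_i) = images[i] on the standard basis."""
    def theta(coefficients: tuple) -> int:
        # theta(sum r_i w_i) = sum r_i x_i, computed in Z_modulus
        return sum(r * x for r, x in zip(coefficients, images)) % modulus
    return theta


if __name__ == "__main__":
    theta = linear_extension([2, 3], 6)      # send w_1 to 2 and w_2 to 3 in Z_6
    print(theta((1, 0)), theta((0, 1)), theta((4, 5)))
```

Because every element of $\mathbb{Z}^n$ has a unique coordinate tuple, `theta` is well defined, which is exactly the point of the proof above.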


Every image of a finitely generated module is again finitely generated (verify).
Corollary 1 below is a useful converse of this: If $M$ in Theorem 5 is generated by
$\{x_1, \ldots, x_n\}$, then the map $\theta$ is onto and we obtain


Corollary 1. Every n-generated module is an image of a free module with a basis
of $n$ elements.


If the module $M$ in Theorem 5 is itself free with basis $\{x_1, \ldots, x_n\}$, then $\theta$ is
onto because the $x_i$ generate $M$, and $\theta$ is one-to-one because the $x_i$ are independent.
Hence $\theta$ is an isomorphism and this, together with Example 8, gives


Corollary 2. Every free module with a basis of $n$ elements is isomorphic to $R^n$.


Corollary 3. If $P$ and $Q$ are free with bases of $m$ and $n$ elements, respectively,
then $P \oplus Q$ is free with a basis of $m + n$ elements.


Proof. Regard $P \oplus Q$ as an internal direct sum. If $\{x_1, \ldots, x_m\}$ and $\{y_1, \ldots, y_n\}$
are bases of $P$ and $Q$, respectively, then $\{x_1, \ldots, x_m, y_1, \ldots, y_n\}$ is a basis of $P \oplus Q$
as the reader can verify. ∎


⁸² There is a general definition of a free module. Let $X$ be a possibly infinite set in a module $M$.
Then $X$ is called a basis of $M$ if it is independent (each finite subset of $X$ is independent) and
generates $M$ (every element of $M$ is a linear combination of (finitely many) elements of $X$). Then
a module is called free if it has a basis. Our concern is about finite bases, but much of what we
do generalizes.




If ${}_RW$ is free with basis $\{w_1, \ldots, w_n\}$, another important special case of Theorem
5 arises as follows: For each $k \in \{1, 2, \ldots, n\}$ define

$$\pi_k : W \to R \text{ by } \pi_k\left(\sum_i r_iw_i\right) = r_k.$$

These are onto, R-linear maps for each $k$, called the projections associated with
the basis $\{w_1, \ldots, w_n\}$. The following useful property is easily verified:

$$x = \sum_k \pi_k(x)w_k, \quad \text{for all } x \in W.$$

In contrast to Corollary 1, a free module $W$ can only be the image of a module $M$
if $M$ contains a direct summand isomorphic to $W$. A module $P$ is called projective
if it satisfies the following property:

If $\alpha : M \to P$ is R-linear and onto then $M = \ker\alpha \oplus P_1$

for some submodule $P_1$ of $M$. Note that $P_1 \cong P$; indeed $P_1 \cong M/\ker\alpha \cong \alpha(M) = P$.
Projective modules are important in ring theory and homological algebra.


Theorem 6. Every free module with a finite basis is projective. 


Proof. Let $\alpha : M \to W$ be onto and R-linear. If $\{w_1, \ldots, w_n\}$ is a basis of $W$
choose $x_i \in M$ such that $\alpha(x_i) = w_i$ ($\alpha$ is onto). By Theorem 5, there exists
$\theta : W \to M$ such that $\theta(w_i) = x_i$ for each $i$. It follows that $\alpha[\theta(w_i)] = \alpha(x_i) = w_i$
for each $i$, and hence that $\alpha(\theta(w)) = w$ for every $w \in W$ (since the $w_i$ generate $W$).
In other words, $\alpha\theta = 1_W$. With this we can show

$$M = \ker\alpha \oplus \theta(W).$$

We have $\alpha[x - \theta\alpha(x)] = \alpha(x) - \alpha\theta\alpha(x) = 0$ for all $x \in M$, so $M = \ker\alpha + \theta(W)$
because $x = [x - \theta\alpha(x)] + \theta\alpha(x)$. But if $y \in \ker\alpha \cap \theta(W)$ write $y = \theta(w)$, where
$w \in W$. Then $w = 1_W(w) = \alpha\theta(w) = \alpha(y) = 0$, so $y = \theta(w) = \theta(0) = 0$. This
completes the proof. ∎


Up to this point the properties of modules we have derived are valid for all rings
$R$, even noncommutative ones, but sometimes restrictions on $R$ must be made. For
example, the invariance theorem (Theorem 5 §6.1) shows that any two bases of a
finite dimensional vector space have the same number of elements (the dimension),
but this theorem fails for free modules in general. We say that the ring $R$ has
invariant basis number (IBN) if, whenever a free R-module has a finite basis,
then any two bases have the same number of elements. In this language, the above
discussion shows that all fields have IBN. The next theorem shows that it is enough
that $R$ has an image $R/A$ that is a field. In fact it is enough that $R/A$ has IBN for
some ideal $A$ of $R$.

Suppose $A$ is an ideal of $R$, and $W = {}_RW$ is a module. Then $AW$ is a submodule
of $W$ (Example 5) and we claim that $W/AW$ is a module over the ring $R/A$ with
the action $(r + A)(w + AW) = rw + AW$ for all $r \in R$ and $w \in W$. To see that
this is well-defined, let $r + A = s + A$ and $w + AW = v + AW$; we must show that
$rw + AW = sv + AW$. But $rw - sv = r(w - v) + (r - s)v \in AW$ because $w - v \in AW$
and $r - s \in A$. The module axioms are routine verifications. Now we can prove


Theorem 7. Let $R$ be a ring that has an ideal $A$ such that $R/A$ has IBN. Then $R$
has IBN.




Proof. If $\{w_i \mid 1 \leq i \leq k\}$ is a basis of ${}_RW$, it is enough to show that the set
$\{w_i + AW \mid 1 \leq i \leq k\}$ is a basis of ${}_{R/A}(W/AW)$; then $k$ is uniquely determined
by hypothesis. Given $w + AW$ in $W/AW$, write $w = \sum_i r_iw_i$, $r_i \in R$. Then

$$w + AW = \left(\sum_i r_iw_i\right) + AW = \sum_i r_i(w_i + AW) = \sum_i (r_i + A)(w_i + AW),$$

which shows that $\{w_i + AW \mid 1 \leq i \leq k\}$ generates $W/AW$. To see that it is
independent, observe first that $AW = \oplus_i Aw_i$ (Exercise 21). With this, suppose that
$\sum_i (r_i + A)(w_i + AW) = 0$, $r_i \in R$. Then $\left(\sum_i r_iw_i\right) + AW = 0$, so $\sum_i r_iw_i \in AW$, say
$\sum_i r_iw_i = \sum_i a_iw_i$, $a_i \in A$. As the $w_i$ are independent this implies that $r_i = a_i \in A$
for each $i$, so $r_i + A = 0$ for each $i$. This shows that $\{w_i + AW \mid 1 \leq i \leq k\}$ is
independent in ${}_{R/A}(W/AW)$, and so is a basis. This is what we wanted. ∎


If R has IBN, the number of elements in any basis of a free module W is called 
the rank of W, and denoted rank W. Thus, rank and dimension are the same for 
vector spaces. The ring Z of integers has IBN because Z/ (p) = Z, is a field for 
every prime p. More generally, every PID R has IBN. Indeed: If R is a field there 
is nothing to do; otherwise R contains a prime element p (it is a UFD by Theorem 
2 §5.2) and R/ (p) is a field by Theorem 3 §5.2. Hence 


Corollary. Every PID has IBN.


Remark. In fact every commutative ring has IBN, but the proof involves a set- 
theoretic result called Zorn’s lemma (see Appendix C). This guarantees the pres- 
ence of a maximal ideal A in any ring R (even noncommutative), and R/A is a field 
if R is commutative. 


Exercises 7.1 


Throughout these exercises R always denotes a ring. 


1. If $M$ is an $R$-module, show that:
(a) $0x = 0$ for all $x \in M$.
(b) $r0 = 0$ for all $r \in R$.
(c) $(-1)x = -x$ for all $x \in M$.

2. Let $K \subseteq M$ be modules. Show that
(a) If $M$ is $n$-generated, then every image of $M$ is $m$-generated for some $m \le n$.
(b) If $M = K \oplus N$ is finitely generated, so are $K$ and $N$.
(c) If both $K$ and $M/K$ are finitely generated, so is $M$.

3. If $e^2 = e \in R$, show that $Re$ is an ideal of $R$ if and only if $eR(1 - e) = 0$.

4. Let $R = F \times F \times \cdots$ ($F$ a field), a ring with componentwise operations, and let
$A = \{(a_1, a_2, \ldots) \mid a_i = 0 \text{ for all but finitely many } i\}$. Show that $A$ is an ideal of
$R$ that is not finitely generated.

5. Let $A$ and $B$ be left ideals of $R$, and let $K$ and $N$ be submodules of ${}_RM$.

(a) Show that $AK$ is a submodule of $M$.
(b) Show that $A(K + N) = AK + AN$.
(c) Show that $(A + B)K = AK + BK$.

6. Let $A$ be an ideal of $R$, and let ${}_RM$ be a module. If both ${}_RA$ and $M$ are finitely
generated, show that $AM$ is finitely generated. [Hint: Exercise 5.]

7. Let $K$ and $N$ be submodules of a module $M$.

(a) Show that $(K + N)/K \cong N/(K \cap N)$.
(b) If $K \subseteq N$ show that $N/K$ is a submodule of $M/K$ and $(M/K)/(N/K) \cong M/N$.


8. Let $R$ be an integral domain. Given ${}_RM$ let $T(M) = \{t \in M \mid t \text{ is torsion}\}$.
(a) Show that $T(M)$ is a submodule of $M$, called the torsion submodule.
(b) Show that $T[M/T(M)] = 0$. We say that $M/T(M)$ is torsion-free.

9. If $M = P \oplus Q$ are modules, show that $M/P \cong Q$ and $M/Q \cong P$. [Hint: Theorem 3(2).]

10. Let $K \subseteq N$ be submodules of a module $M$. If $K \cap X = N \cap X$ and $K + X = N + X$
for some submodule $X$, show that $K = N$. [Hint: $N = N \cap (N + X)$.]


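11. Let $M = \mathbb{Z} \oplus \mathbb{Z}$, and $K = \{(k, k) \mid k \in \mathbb{Z}\}$. Determine if $M = K \oplus X$ in case

(a) $X = \{(k, 0) \mid k \in \mathbb{Z}\}$    (b) $X = \{(0, k) \mid k \in \mathbb{Z}\}$
(c) $X = \{(2k, 3k) \mid k \in \mathbb{Z}\}$    (d) What if $X = \{(k, -k) \mid k \in \mathbb{Z}\}$?

12. If $M_i \cong N_i$, $1 \le i \le k$, show that $M_1 \oplus M_2 \oplus \cdots \oplus M_k \cong N_1 \oplus N_2 \oplus \cdots \oplus N_k$.

13. Let $M = M_1 \oplus M_2 \oplus \cdots \oplus M_n$ be an internal direct sum of modules. If we have
$M_1 = K_1 \oplus \cdots \oplus K_m$, show that $M = K_1 \oplus \cdots \oplus K_m \oplus M_2 \oplus \cdots \oplus M_n$.

14. Let $M = M_1 \oplus M_2 \oplus \cdots \oplus M_n$ be an internal direct sum of modules. If $K_i \subseteq M_i$ for
each $i$, show that $K_1 + K_2 + \cdots + K_n$ is a direct sum.

15. If $R = A \oplus B$ where $A$ and $B$ are left ideals, show that there exists $e^2 = e \in R$ such
that $A = Re$ and $B = R(1 - e)$. [Hint: Let $1 = e + f$, $e \in A$, $f \in B$. If $a \in A$
consider $a - ae = af \in A \cap B$.]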

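16. Given ${}_RM$, an $R$-linear map $\pi : M \to M$ is called a projection if $\pi^2 = \pi$.

(a) If $\pi$ is a projection, show that $M = \pi(M) \oplus \ker \pi$.

(b) If $M = N \oplus K$, find a projection $\pi$ such that $N = \pi(M)$ and $K = \ker \pi$.

17. Let $M \xrightarrow{\beta} N \xrightarrow{\alpha} M$ be $R$-linear. If $\alpha\beta = 1_M$, show that $N = \beta(M) \oplus \ker \alpha$.

18. If $M = \mathbb{Z}_2 \oplus \mathbb{Z}_4$, write $K = \mathbb{Z}_2 \oplus 0$, $N = 0 \oplus \mathbb{Z}_4$, and $X = 0 \oplus \{0, 2\}$. Show that

(a) $K \cong X$ but $M/K \not\cong M/X$.

(b) $M/(K \oplus X) \cong M/N$ but $K \oplus X \not\cong N$.

19. If $G$ is an abelian group and $|G| = mn$, where $\gcd(m, n) = 1$, show that $G = G_m \oplus G_n$,
where $G_k = \{g \in G \mid kg = 0\}$.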

20. If $R$ is a domain, show that every finitely generated free $R$-module is torsion free.

21. Let $A$ be an ideal of a ring $R$, let ${}_RW$ be a module, and consider the $R/A$-
module $W/AW$ with action $(r + A)(w + AW) = rw + AW$ as in the discussion
preceding Theorem 7. If $\alpha : {}_RW \to {}_RV$ is $R$-linear, define $\bar{\alpha} : W/AW \to V/AV$ by
$\bar{\alpha}(w + AW) = \alpha(w) + AV$ for all $w + AW \in W/AW$.

(a) Show that $\bar{\alpha}$ is well defined.

(b) Show that $\bar{\alpha}$ is $(R/A)$-linear.

22. A module ${}_RM$ is called simple if 0 and $M$ are the only submodules. If $R$ is a ring,
show that ${}_RR$ is simple if and only if $R$ is a division ring.

23. If ${}_RM$ and ${}_RN$ are simple (preceding exercise), prove Schur's Lemma: If $\alpha : M \to N$
is $R$-linear, then either $\alpha = 0$ or $\alpha$ is an isomorphism.

24. Show that the following conditions on a finitely generated module $P$ are equivalent:

(1) $P$ is projective.
(2) $P$ is isomorphic to a direct summand of a free module.
(3) If $\alpha : M \to N$ and $\beta : P \to N$ are $R$-linear and $\alpha$ is onto,
then $\gamma : P \to M$ exists such that $\alpha\gamma = \beta$.
(4) If $\alpha : M \to P$ is onto and $R$-linear,
there exists $\gamma : P \to M$ such that $\alpha\gamma = 1_P$.

[Hint: For (2) $\Rightarrow$ (3) assume that $F = P \oplus Q$ is free for some module $Q$. Define
$\pi : F \to P$ by $\pi(p + q) = p$ for all $p \in P$ and $q \in Q$. If $\{x_1, \ldots, x_n\}$ is a basis of $F$
choose $m_i \in M$ such that $\alpha(m_i) = \beta\pi(x_i)$ for each $i$. By Theorem 5, there exists
an $R$-homomorphism $\theta : F \to M$ such that $\theta(x_i) = m_i$ for each $i$.]




25. Show that $\mathbb{Q}$ is a torsion-free $\mathbb{Z}$-module that is not free.
[Hint: Call an additive group $Q$ divisible if, for all $0 < n \in \mathbb{Z}$ and all $q \in Q$,
the equation $nx = q$ has a solution $x \in Q$. Show that $\mathbb{Q}$ is divisible, direct
summands of divisible groups are divisible, and $\mathbb{Z}$ is not divisible.]


7.2 MODULES OVER A PID


Unless otherwise noted, throughout this section $R$ will denote a principal ideal
domain (PID); that is, $R$ is an integral domain and every ideal $A$ of $R$ has the form
$A = Ra$ for some $a \in R$.³³ These rings are discussed in detail in Section 5.2. The
main example is R = Z, in which case the modules are the abelian groups (written 
additively). We will state most of the theorems in this section in the case that R 
is a general PID, but the reader should keep the abelian group case in mind for 
motivation. 

The goal of this section is to completely describe the finitely generated modules 
over a PID, and the following theorem is fundamental. 


Theorem 1. If $R$ is a PID, every finitely generated module ${}_RM$ has a decomposition
as a direct sum of principal submodules $Rx_i$, $x_i \in M$:


$M = Rx_1 \oplus Rx_2 \oplus \cdots \oplus Rx_n$.


Due to its difficulty, the proof of Theorem 1 (and of a uniqueness property) is left to 
the end of this section; we focus instead on how Theorem 1 is used to give explicit 
information about finitely generated modules. In particular, we obtain a complete 
description of all finitely generated abelian groups. 

It is routine to verify that if $M$ is a free module over a domain with a finite
basis then $M$ is finitely generated and torsion free. If the ring is a PID the converse
holds. Recall that if $M$ is a free module with a finite basis over a PID, the number
of elements in the basis is uniquely determined (Corollary to Theorem 5 §7.1) and is
called the rank of $M$.


Theorem 2. If $R$ is a PID then a module ${}_RM$ is free of finite rank if and only if
it is finitely generated and torsion-free.


Proof. If $M$ is free of rank $n$, let $\{w_1, \ldots, w_n\}$ be any basis. Then $M = \sum Rw_i$ so
$M$ is certainly finitely generated. If $am = 0$ with $0 \neq m \in M$, write $m = \sum r_i w_i$,
$r_i \in R$. Then $0 = am = \sum (ar_i)w_i$ so each $ar_i = 0$ because the $w_i$ are independent.
But some $r_i \neq 0$ (because $m \neq 0$), so it follows that $a = 0$ (because $R$ is a domain).
Hence $M$ is torsion free.

Conversely, if $M$ is finitely generated and torsion-free, then Theorem 1 shows
that $M = Rx_1 \oplus Rx_2 \oplus \cdots \oplus Rx_n$. We may assume that $x_i \neq 0$ for all $i$. Hence,
each $x_i$ is torsion free because $M$ is torsion free, so $\{x_1, x_2, \ldots, x_n\}$ is a basis of $M$
by Theorem 4 §7.1. ∎


The principal submodules $Rx$ arising in Theorem 1 can be easily described in
terms of the ring $R$. The map $r \mapsto rx$ from $R \to Rx$ is $R$-linear and onto with


³³ We use the suggestive notation $Ra$ rather than $(a)$ for the ideal generated by $a \in R$.




kernel $\mathrm{ann}\,x = \{r \in R \mid rx = 0\}$, called the annihilator of $x$. Hence, $Rx \cong R/\mathrm{ann}\,x$
by the isomorphism theorem (Theorem 1 §7.1). There are two cases:

If $x$ is torsion free, $\mathrm{ann}\,x = 0$ so $Rx \cong R$ is free of rank 1.

If $x$ is torsion, $\mathrm{ann}\,x = Rd$ for some $0 \neq d \in R$ ($R$ is a PID) so $Rx \cong R/Rd$.

In the case that $x$ is torsion, it is instructive to look at the case when $R = \mathbb{Z}$. Then
the torsion elements $x$ are just those of finite order. In this case, the order $o(x) = n$
if and only if $\mathrm{ann}\,x = \mathbb{Z}n$, and this gives a way to extend the notion of order to
torsion elements of any module over an arbitrary PID.


Torsion Submodule 


Let $R$ be a PID and let ${}_RM$ be a module. If $x \in M$ is a torsion element then $\mathrm{ann}\,x$
is a nonzero ideal of $R$ so, as $R$ is a PID, $\mathrm{ann}\,x = Rd$ for some $0 \neq d \in R$. We define
the order $o(x)$ of a torsion element $x$ as follows:


$o(x) = d$, where $\mathrm{ann}\,x = Rd$.³⁴


Of course $d = o(x)$ is only unique up to multiplication by a unit of $R$ by Theorem
1 §5.1. However, we do have the following properties familiar from the group case:
If $o(x) = d \neq 0$ then


(1) $dx = 0$.
(2) If $r \in R$, then $rx = 0$ if and only if $d \mid r$.


The routine verifications are left to the reader. 

If $R = \mathbb{Z}$, let $x$ be a torsion element in some group. If $\mathrm{ann}\,x = \mathbb{Z}d$, $d \neq 0$, then
also $\mathrm{ann}\,x = \mathbb{Z}(-d)$. However, these are the only generators of $\mathrm{ann}\,x$ (since 1 and
$-1$ are the only units in $\mathbb{Z}$). Hence, the convention in group theory is to make the
order of $x$ unique by choosing the positive generator for $\mathrm{ann}\,x$.
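
As a small illustration for $R = \mathbb{Z}$, the following Python sketch computes the positive generator of $\mathrm{ann}\,x$ for an element $x$ of the cyclic group $\mathbb{Z}_n$. The function names are hypothetical, and the closed formula $n/\gcd(n, x)$ is a standard fact about cyclic groups that is assumed here rather than taken from the text; the brute-force search is included only as a cross-check.

```python
from math import gcd

def annihilator_generator(x: int, n: int) -> int:
    """Positive generator d of ann(x) = {r in Z : r*x = 0 in Z_n}."""
    # Assumed standard formula: the order of x in Z_n is n / gcd(n, x).
    return n // gcd(n, x % n)

def order_by_search(x: int, n: int) -> int:
    """Least positive r with r*x = 0 in Z_n, found by direct search."""
    r = 1
    while (r * x) % n != 0:
        r += 1
    return r

# Property (2) of the order: r*x = 0 exactly when d divides r.
d = annihilator_generator(8, 24)
multiples_killing_8 = [r for r in range(1, 13) if (r * 8) % 24 == 0]
```

Here `multiples_killing_8` collects the positive integers up to 12 that annihilate 8 in $\mathbb{Z}_{24}$, and each of them is a multiple of `d`.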

If $R$ is a PID and $M$ is an $R$-module, the set of all torsion elements of $M$ is a
submodule of $M$, called the torsion submodule of $M$, denoted


$T(M) = \{x \in M \mid \mathrm{ann}\,x \neq 0\} = \{x \in M \mid o(x) \neq 0\}$.

Hence $M$ is torsion free if and only if $T(M) = 0$, and $M$ is torsion if and only if
$T(M) = M$. Thus finite abelian groups are torsion as $\mathbb{Z}$-modules.


Example 1. Let C* be the (multiplicative) group of all nonzero complex numbers. 
Then T(C*) consists of all roots of unity. Note that T(C*) is an example of a torsion 
group that is not finite. 


Theorem 3. Let $M$ be a finitely generated module over a PID. Then
(1) $T(M)$ is a torsion submodule of $M$ and $M/T(M)$ is torsion free.
(2) $M = T(M) \oplus W$, where $W$ is free of finite rank.


Proof. For convenience write $T(M) = T$.

(1) To see that $M/T$ is torsion-free, let $x + T \neq 0$ in $M/T$; we must show that
$\mathrm{ann}(x + T) = 0$. If not, let $r(x + T) = 0$, where $r \neq 0$. Then $rx \in T$, say $s(rx) = 0$
for some $s \neq 0$ in $R$. But $sr \neq 0$ because $R$ is a domain, so this implies that $x \in T$,
contradicting the assumption that $x + T \neq 0$.


³⁴ Warning. This notion of order does not coincide with the group-theoretic notion for torsion-free
elements. In a group, $x$ has infinite order if and only if $\mathrm{ann}(x) = 0$.




(2) The coset map $\varphi : M \to M/T$ is onto and $\ker \varphi = T$. Moreover, $M/T$ is
finitely generated (since $M$ is) and torsion free (by (1)), and so is free of finite rank
by Theorem 2. But then $M = \ker \varphi \oplus W$ for some submodule $W \subseteq M$ by Theorem
6 §7.1. Since $W \cong M/\ker \varphi = M/T$, this proves (2). ∎


If $R$ is a PID, the finitely generated free modules are well understood (they
are all isomorphic to $R^n$ where $n$ is the rank). Thus, Theorem 3 shows that the
task of describing all finitely generated modules is reduced to looking at the torsion
modules. We now turn to this task.


Primary Decomposition 


Let $R$ be a PID, and recall that $p \in R$ is called a prime if, whenever $p \mid ab$ in $R$ then
either $p \mid a$ or $p \mid b$. If $p$ is a prime and ${}_RM$ is an $R$-module define


$M(p) = \{x \in M \mid p^k x = 0 \text{ for some integer } k \ge 0\}$.


One verifies that this is a submodule of $M$ for any prime $p$, called the p-primary
component of $M$. Note that in a PID the only divisors of $p^k$ (up to unit multiples)
are powers of $p$. Hence,


$M(p) = \{x \in M \mid o(x) = p^k \text{ for some integer } k \ge 0\}$.

Example 2. Write $\mathbb{Z}_{24} = \{0, 1, 2, \ldots, 23\}$. Then one verifies that
$\mathbb{Z}_{24}(2) = \mathbb{Z}_{24} \cdot 3 = \{0, 3, 6, 9, 12, 15, 18, 21\}$ and $\mathbb{Z}_{24}(3) = \mathbb{Z}_{24} \cdot 8 = \{0, 8, 16\}$.
We have $\mathbb{Z}_{24}(p) = \{0\}$ for all primes $p$ other than 2 or 3.
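
The verification in Example 2 can be checked mechanically. The sketch below is only an illustration for the $\mathbb{Z}$-module $\mathbb{Z}_n$; the function name `primary_component` is hypothetical. It relies on the observation that, in $\mathbb{Z}_n$, an element is killed by some power of $p$ exactly when it is killed by the largest power of $p$ dividing $n$.

```python
def primary_component(n: int, p: int) -> list[int]:
    """Elements x of Z_n with p**k * x = 0 in Z_n for some k >= 0."""
    # Largest power of p dividing n; it annihilates the whole p-primary component.
    pk = 1
    while n % (pk * p) == 0:
        pk *= p
    return [x for x in range(n) if (pk * x) % n == 0]

two_part = primary_component(24, 2)    # multiples of 3 in Z_24
three_part = primary_component(24, 3)  # multiples of 8 in Z_24
five_part = primary_component(24, 5)   # only the zero element
```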


If $M$ is torsion free then $M(p) = 0$ for all primes $p$. However, if $M$ is torsion
and finitely generated, this is far from the case. Indeed, let $M = Rx_1 + \cdots + Rx_k$,
where $R$ is a PID. If $o(x_i) = d_i \neq 0$ for each $i$ then $d = d_1 d_2 \cdots d_k \neq 0$ and $dM = 0$,
where we write $rM = \{rx \mid x \in M\}$. Hence, $\mathrm{ann}(M) \neq 0$, where


$\mathrm{ann}(M) = \{r \in R \mid rM = 0\}$


is called the annihilator of $M$. This is an ideal of $R$, and $M$ is torsion if and only
if $\mathrm{ann}(M) \neq 0$.


Theorem 4. Primary Decomposition Theorem. Let $R$ be a PID, let ${}_RM \neq 0$
be a finitely generated, torsion module, and suppose $dM = 0$ where $0 \neq d \in R$.
Since $R$ is a UFD, let $d = p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m}$, where the $p_i$ are nonassociated primes in
$R$ and each $k_i > 0$. Then

(1) $M = M(p_1) \oplus M(p_2) \oplus \cdots \oplus M(p_m)$.

(2) If $\mathrm{ann}(M) = Rd$ then $M(p_i) \neq 0$ for each $i$.


Proof. (1) We begin by showing that $M = M(p_1) + M(p_2) + \cdots + M(p_m)$. If $m = 1$
then $M(p_1) = M$ because $dM = 0$. So assume that $m \ge 2$.

In this case write $d_i = d/p_i^{k_i}$ for each $i$, and let $c = \gcd(d_1, d_2, \ldots, d_m)$. We claim
that no prime $p$ divides $c$ (if $p \mid c$ then $p \mid d_i$ for each $i$, and this is impossible because
$m \ge 2$). Since $R$ is a UFD, this means that $c$ is a unit in $R$. But Theorem 1 §5.2
shows that $c \in Rd_1 + Rd_2 + \cdots + Rd_m$, and it follows that $1 = r_1 d_1 + r_2 d_2 + \cdots + r_m d_m$
for some $r_i \in R$. Hence, if $x \in M$ we have

$x = r_1 d_1 x + r_2 d_2 x + \cdots + r_m d_m x$, $r_i \in R$.

Moreover, $p_i^{k_i}(r_i d_i x) = r_i(p_i^{k_i} d_i)x = r_i dx = 0$, so $r_i d_i x \in M(p_i)$ for each $i$. Hence,
$M = M(p_1) + M(p_2) + \cdots + M(p_m)$.


To see that this sum is direct, let $x \in [M(p_1) + M(p_2) + \cdots + M(p_{k-1})] \cap M(p_k)$;
we must show that $x = 0$ by Theorem 3 §7.1. Suppose $x_1 + x_2 + \cdots + x_{k-1} = x$ with
$x_i \in M(p_i)$ for each $i = 1, 2, \ldots, k - 1$ and $x \in M(p_k)$, say $p_i^{n_i} x_i = 0$ for each $i$, and
$p_k^{n_k} x = 0$. Write $q = p_1^{n_1} \cdots p_{k-1}^{n_{k-1}}$. Then $q x_i = 0$ for each $i < k$. But $\gcd(q, p_k^{n_k}) = 1$
because the $p_i$ are nonassociated, say $1 = rq + s p_k^{n_k}$ where $r, s \in R$. Thus


$x = 1x = rqx + s p_k^{n_k} x = rqx + 0 = rq(x_1 + x_2 + \cdots + x_{k-1}) = 0$,


as required. Hence, $M = M(p_1) \oplus M(p_2) \oplus \cdots \oplus M(p_m)$, which proves (1).

(2) Assume that $\mathrm{ann}(M) = Rd$. Each $M(p_i)$ is an image of $M$ by (1), and so is
finitely generated. Hence, there exists $s_i \ge 1$ such that $p_i^{s_i} M(p_i) = 0$. Now suppose
that $M(p_1) = 0$. This is impossible if $m = 1$ (we are assuming that $M \neq 0$). If
$m \ge 2$, and we write $b = p_2^{s_2} \cdots p_m^{s_m}$, it follows that $b \in \mathrm{ann}(M) = Rd$. But then $d \mid b$,
a contradiction because $p_1$ does not divide $b$. So $M(p_1) \neq 0$. A similar argument
shows that $M(p_i) \neq 0$ for each $i$. ∎
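
The first step of the proof is constructive: once $1 = r_1 d_1 + \cdots + r_m d_m$ is known, each $r_i d_i x$ is the component of $x$ in $M(p_i)$. The following sketch carries this out for the hypothetical case $R = \mathbb{Z}$, $M = \mathbb{Z}_{60}$, $d = 60 = 4 \cdot 3 \cdot 5$. The helper names (`ext_gcd`, `bezout`, `primary_parts`) are illustrative, and the extended Euclidean algorithm is a standard way to find the $r_i$ that the proof only asserts to exist.

```python
def ext_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b, for a, b >= 0."""
    if b == 0:
        return (a, 1, 0)
    g, s, t = ext_gcd(b, a % b)
    return (g, t, s - (a // b) * t)

def bezout(values: list[int]) -> tuple[int, list[int]]:
    """Return (g, coeffs) with g = gcd(values) = sum(c * v)."""
    g, coeffs = values[0], [1]
    for v in values[1:]:
        g, s, t = ext_gcd(g, v)
        coeffs = [c * s for c in coeffs] + [t]
    return g, coeffs

def primary_parts(x: int, n: int, prime_powers: list[int]) -> list[int]:
    """Split x in Z_n into one component per prime power (their product is n)."""
    ds = [n // q for q in prime_powers]      # d_i = d / p_i^{k_i}
    g, rs = bezout(ds)                       # 1 = r_1 d_1 + ... + r_m d_m
    assert g == 1
    return [(r * d * x) % n for r, d in zip(rs, ds)]

parts = primary_parts(7, 60, [4, 3, 5])
```

The components in `parts` add up to 7 modulo 60, and the $i$-th one is annihilated by the $i$-th prime power, which is exactly what the proof shows in general.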


Corollary 1. If $R$ is a PID, let $M$ and $N$ be finitely generated, torsion $R$-modules.
Then $M \cong N$ if and only if $M(p) \cong N(p)$ for all primes $p \in R$.


Proof. If $M(p) \cong N(p)$ for each prime $p$, then $M \cong N$ by Theorem 4. Conversely,
if $\alpha : M \to N$ is $R$-linear then $\alpha[M(p)] \subseteq N(p)$, with equality if $\alpha$ is onto (verify).
So, if $\alpha$ is an isomorphism then $\alpha : M(p) \to N(p)$ is an isomorphism. ∎


Example 3. Find the primary decomposition of the abelian group $G = \mathbb{Z}_{60}$, and
find a generator for each primary component.


Solution. Write $\bar{k} = k$ in $G$. We have $\mathrm{ann}(G) = \mathbb{Z}60$, so $d = 60 = 2^2 \cdot 3 \cdot 5$ and
the primary decomposition is $\mathbb{Z}_{60} = G(2) \oplus G(3) \oplus G(5)$. Moreover, we claim that
$G(2) = \mathbb{Z}15$, $G(3) = \mathbb{Z}20$, and $G(5) = \mathbb{Z}12$.

We show that $G(3) = \mathbb{Z}20$, and leave the (similar) verifications of the others to
the reader. We have $G(3) = \{a \mid 3^k a = 0 \text{ for some } k \ge 0\}$. Hence $20 \in G(3)$, so
$\mathbb{Z}20 \subseteq G(3)$. Conversely, if $a \in G(3)$ then $3^k a = 0$ for some $k$, whence $60 \mid 3^k a$. Hence
$4 \mid 3^k a$ so, since $\gcd(4, 3^k) = 1$ we have $4 \mid a$. Similarly $5 \mid a$ so, since $\gcd(4, 5) = 1$, it
follows that $4 \cdot 5 = 20 \mid a$. Hence $a = 20k$ for some $k$, so $a \in \mathbb{Z}20$. This proves that
$G(3) \subseteq \mathbb{Z}20$. □


A finite abelian group $G$ is a finitely generated, torsion $\mathbb{Z}$-module. If $p \in \mathbb{Z}$ is
a prime, $G$ is called a p-group if the order of every element of $G$ is a power
of $p$. Thus, the primary component $G(p) = \{x \in G \mid o(x) = p^k \text{ for some } k \ge 0\}$ is a
$p$-group for each prime $p$ dividing $|G|$. Moreover, $G(p)$ contains every $p$-subgroup of
$G$ and so is the unique largest $p$-subgroup of $G$. Thus in Example 3, $\mathrm{ann}(\mathbb{Z}_{60}) = \mathbb{Z}60$ so
$\mathbb{Z}_{60} = \mathbb{Z}_{60}(2) \oplus \mathbb{Z}_{60}(3) \oplus \mathbb{Z}_{60}(5)$. Observe that 60 has prime factorization $60 = 2^2 \cdot 3 \cdot 5$,
and $|\mathbb{Z}_{60}(2)| = 2^2$, $|\mathbb{Z}_{60}(3)| = 3$, and $|\mathbb{Z}_{60}(5)| = 5$ by Example 3. This holds for any
finite abelian group.




Corollary 2. Primary Decomposition Theorem for Finite Abelian Groups.
Let $G$ be a finite abelian group of order $|G| = p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}$, where the $p_i$ are
distinct primes. Then


$G = G(p_1) \oplus G(p_2) \oplus \cdots \oplus G(p_m)$.

Moreover, $|G(p_i)| = p_i^{n_i}$ for each $i$.


Surprisingly, the proof of the last statement in Corollary 2 requires the following 
important fact. 


Lemma 1. Let $G$ be a finite abelian group, and let $p \in \mathbb{Z}$ be a prime.
(1) If $p$ divides $|G|$, then $G$ has an element of order $p$.³⁵
(2) $G$ is a $p$-group if and only if $|G| = p^n$ for some $n \ge 0$.


Proof. (1) Use induction on $|G|$. It is clear if $|G| = 1$, 2, or 3. If $|G| > 3$, choose
some $h \in G$, $h \neq 0$, and write $o(h) = n$. If $p \mid n$, then $o(\frac{n}{p} h) = p$ and we are done. So
assume that $\gcd(p, n) = 1$. If we write $H = \langle h \rangle$, then $|G| = |H|\,|G/H| = n\,|G/H|$,
so $p$ divides $|G/H|$. By induction, let $g + H$ be a coset in $G/H$ of order $p$. We claim
that $o(ng) = p$. We have $p(g + H) = 0$, so $pg \in H$. Because $|H| = n$, this gives
$0 = n(pg) = p(ng)$ by Lagrange's theorem. As $p$ is a prime, it remains to show that
$ng \neq 0$. But $ng = 0$ implies that $n(g + H) = 0$ in $G/H$ and so, because $g + H$ has
order $p$, it yields $p \mid n$, contrary to assumption.

(2) If $G$ is a $p$-group then (1) shows that $p$ is the only prime divisor of $|G|$, and
so $|G| = p^n$ for some $n \ge 0$. The converse is by Lagrange's theorem. ∎


Proof of Corollary 2. Write $d = p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}$. Then $dG = 0$ by Lagrange's
theorem so Theorem 4 shows that $G = G(p_1) \oplus G(p_2) \oplus \cdots \oplus G(p_m)$. Clearly $G(p_i)$ is
a $p_i$-group for each $i$, so let $|G(p_i)| = p_i^{k_i}$ for some $k_i$ by Lemma 1. But then


$|G| = |G(p_1)|\,|G(p_2)| \cdots |G(p_m)| = p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m}$.

Since $|G| = d = p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}$, it follows that $k_i = n_i$ for each $i$ by the uniqueness
of the prime factorization of integers. ∎


p-Modules 


The primary decomposition theorem shows that to describe the torsion modules $M$
over a PID $R$ it is enough to describe $M(p)$ for each prime $p \in R$. To this end,
an $R$-module $M$ is called a p-module if, for each $x \in M$, $p^k x = 0$ for some $k \ge 0$,
equivalently if $o(x) = p^n$ for some integer $n \ge 0$. Hence, if $p \in \mathbb{Z}$ is a prime then the
$p$-modules are just the $p$-groups defined above. Note that $M(p)$ is a $p$-module for
any $M$, and that images and submodules of $p$-modules are again $p$-modules. The
following theorem gives a concise description of all $p$-modules over any PID.

We will need one fact: If p is a prime in a PID R, then the factor ring R/Rp is 
a field. This is part of Theorem 3 §5.2, and actually characterizes the primes in R. 


Theorem 5. Let $R$ be a PID, let $p \in R$ be a prime, and let $M$ be a finitely
generated, nonzero $p$-module over $R$. Then there is a decomposition


$M = Rx_1 \oplus Rx_2 \oplus \cdots \oplus Rx_t$,    (*)


³⁵ This actually holds for any finite group, abelian or not, and is called Cauchy's theorem. We
prove it in Section 8.2.




where $o(x_i) = p^{m_i}$ with $m_1 \ge m_2 \ge \cdots \ge m_t \ge 1$. Furthermore, the integers
$t, m_1, m_2, \ldots, m_t$ are uniquely determined by $M$.

More generally, if $K \subseteq M$ is any submodule and $K = Ry_1 \oplus Ry_2 \oplus \cdots \oplus Ry_u$
where $o(y_i) = p^{k_i}$ with $k_1 \ge k_2 \ge \cdots \ge k_u \ge 1$, then $u \le t$ and $k_i \le m_i$ for each
$i = 1, 2, \ldots, u$.

Proof. The decomposition in (*) exists by Theorem 1, each $o(x_i)$ is a power of $p$
because $M$ is a $p$-module, and we can ensure that $m_1 \ge m_2 \ge \cdots \ge m_t$ by relabeling
the $x_i$. The uniqueness of $t$ and the $m_i$ follows from the last sentence of the theorem
with $K = M$.

So let $K \subseteq M$ and, since $K$ is also a $p$-module, use (*) to write $K$ as in the
theorem. We begin by showing that $u \le t$. Define a submodule $L_p(M)$ of $M$ by


$L_p(M) = \{x \in M \mid px = 0\}$.


This is a vector space over the field $R/Rp$ via the action $(r + Rp)x = rx$ for all
$r \in R$. Moreover, a routine computation shows that $L_p(Rx_i) = R(p^{m_i - 1} x_i)$ for each
$i$, so $L_p(Rx_i)$ has dimension 1 over $R/Rp$. But


$L_p(M) = L_p(Rx_1) \oplus L_p(Rx_2) \oplus \cdots \oplus L_p(Rx_t)$,


so $L_p(M)$ has $R/Rp$-dimension $t$. Hence, $u \le t$ because $L_p(K) \subseteq L_p(M)$.

The rest is proved by induction on $n \ge 0$, where $p^n M = 0$ (such an $n$ exists
because $M$ is a finitely generated $p$-module). Now $m_i \le n$ as $p^n \in \mathrm{ann}\,Rx_i = Rp^{m_i}$,
and similarly $k_i \le n$. If $n = 0$ then $M = 0$ and there is nothing to prove. If $n = 1$
then $k_i = m_i = 1$ for each $i = 1, 2, \ldots, u$. If $n \ge 2$ consider $pM = \{px \mid x \in M\}$.
This is an $R$-submodule of $M$ and one verifies that

$pM = Rpx_1 \oplus \cdots \oplus Rpx_\lambda$, where $m_\lambda > 1$ and $m_{\lambda+1} = \cdots = m_t = 1$,


$pK = Rpy_1 \oplus \cdots \oplus Rpy_\mu$, where $k_\mu > 1$ and $k_{\mu+1} = \cdots = k_u = 1$.


Now observe that $o(px_i) = p^{m_i - 1}$ for $1 \le i \le \lambda$ and $o(py_i) = p^{k_i - 1}$ for $1 \le i \le \mu$.
Since $p^{n-1}(pM) = 0$, induction gives $\mu \le \lambda$ and $k_i \le m_i$ for $1 \le i \le \mu$. But if
$\mu < i \le u$ then $k_i = 1 \le m_i$, which completes the proof. ∎


Let $M$ be a finitely generated, nonzero $p$-module and, as in Theorem 5, let

$M = Rx_1 \oplus Rx_2 \oplus \cdots \oplus Rx_t$,


where $o(x_i) = p^{m_i}$ for each $i$ and $m_1 \ge m_2 \ge \cdots \ge m_t \ge 1$. Then


The $t$-tuple $(m_1, m_2, \ldots, m_t)$ is called the type of the module $M$;


The elements $p^{m_1}, p^{m_2}, \ldots, p^{m_t}$ are called the elementary divisors of $M$.


The integers $m_i$ and the elementary divisors $p^{m_i}$ are uniquely determined by $M$.
Given a sequence of integers $m_1 \ge m_2 \ge \cdots \ge m_t$ the module


$R/Rp^{m_1} \oplus R/Rp^{m_2} \oplus \cdots \oplus R/Rp^{m_t}$

is of type $(m_1, m_2, \ldots, m_t)$. Hence, Theorem 5 gives


Corollary 1. Up to isomorphism, there is exactly one finitely generated p-module 
of each type. 




If $K$ and $M$ are two finitely generated $p$-modules of types $(k_1, k_2, \ldots, k_u)$ and
$(m_1, m_2, \ldots, m_t)$, respectively, we say that $K$ has smaller type than $M$ if $u \le t$
and $k_i \le m_i$ for each $i = 1, 2, \ldots, u$.


Corollary 2. If M is a finitely generated, nonzero p-module, then 


(1) Every nonzero submodule of M has smaller type. 
(2) M has a submodule of each smaller type. 


Proof. (1) This is by Theorem 5.

(2) Let $M$ have type $(m_1, m_2, \ldots, m_t)$, say $M = Rx_1 \oplus \cdots \oplus Rx_t$, $o(x_i) = p^{m_i}$
for each $i$. Suppose that $(k_1, k_2, \ldots, k_u)$ is a smaller type. Then the submodule
$K = R(p^{m_1 - k_1} x_1) \oplus \cdots \oplus R(p^{m_u - k_u} x_u)$ is of type $(k_1, k_2, \ldots, k_u)$. ∎


Theorem 9 §2.4 shows that, if $G$ is a cyclic group of order $n$, then $G$ has exactly
one subgroup of order $d$ for each divisor $d$ of $n$. Corollary 2 shows that, for finite
abelian $p$-groups, the subgroups, although not absolutely unique as in the cyclic
case, are uniquely determined up to type. For example, if $G = \mathbb{Z}_p \oplus \mathbb{Z}_p$, then
$K_1 = \mathbb{Z}_p \oplus \{0\}$, $K_2 = \{0\} \oplus \mathbb{Z}_p$, and $K_3 = \{(a, a) \mid a \in \mathbb{Z}_p\}$ all have type (1).


Example 4. If G is a p-group of type (2,1,1), the possible types of nonzero 
subgroups of G are (2,1,1), (1,1,1), (2,1), (1,1), (2), and (1). O 
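
The list in Example 4 can be reproduced by enumerating every tuple that satisfies the definition of smaller type. The sketch below is a direct brute-force enumeration; the function name `smaller_types` is hypothetical.

```python
from itertools import product

def smaller_types(m: tuple[int, ...]) -> list[tuple[int, ...]]:
    """All types (k_1 >= ... >= k_u >= 1) with u <= len(m) and k_i <= m_i."""
    found = []
    for u in range(1, len(m) + 1):
        # Candidate k_i ranges over 1..m_i for each of the first u positions.
        for ks in product(*(range(1, m[i] + 1) for i in range(u))):
            # Keep only non-increasing tuples, as a type requires.
            if all(ks[i] >= ks[i + 1] for i in range(u - 1)):
                found.append(ks)
    return found

types_of_211 = smaller_types((2, 1, 1))
```

For the type (2, 1, 1) this produces six tuples, matching the six types listed in Example 4.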


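Corollary 3. Let $G$ be a $p$-group with $|G| = p^n$. If $G$ has type $(m_1, m_2, \ldots, m_t)$,
then $n = m_1 + m_2 + \cdots + m_t$.³⁶


Proof. Let $G = \mathbb{Z}x_1 \oplus \mathbb{Z}x_2 \oplus \cdots \oplus \mathbb{Z}x_t$, where $o(x_i) = p^{m_i}$ for each $i$. Then
$|G| = |\mathbb{Z}x_1|\,|\mathbb{Z}x_2| \cdots |\mathbb{Z}x_t| = p^{m_1} p^{m_2} \cdots p^{m_t}$, and the result follows. ∎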


If $G$ is a cyclic group then $G \cong \mathbb{Z}_m$ or $G \cong \mathbb{Z}$ according as $|G| = m$ or $|G| = \infty$.
It is customary to use $\mathbb{Z}_m$ and $\mathbb{Z}$ as representatives of the cyclic groups.


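Example 5. Classify the abelian groups of order $p^5$, where $p$ is a prime.


Solution. The various types are listed together with a representative group.


Type               Group

(5)                $\mathbb{Z}_{p^5}$
(4, 1)             $\mathbb{Z}_{p^4} \oplus \mathbb{Z}_p$
(3, 2)             $\mathbb{Z}_{p^3} \oplus \mathbb{Z}_{p^2}$
(3, 1, 1)          $\mathbb{Z}_{p^3} \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p$
(2, 2, 1)          $\mathbb{Z}_{p^2} \oplus \mathbb{Z}_{p^2} \oplus \mathbb{Z}_p$
(2, 1, 1, 1)       $\mathbb{Z}_{p^2} \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p$
(1, 1, 1, 1, 1)    $\mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_p$


If $n \ge 1$, all abelian groups of order $p^n$ can be described in the same way. □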
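
By Corollary 1 of Theorem 5 and Corollary 3, the abelian groups of order $p^n$ correspond to the types $(m_1, \ldots, m_t)$ with $m_1 + \cdots + m_t = n$. A short recursive generator lists them; the sketch below uses $n = 5$ purely as an illustration, and the names `partitions` and `describe` are hypothetical.

```python
from typing import Iterator, Optional

def partitions(n: int, largest: Optional[int] = None) -> Iterator[tuple[int, ...]]:
    """Yield the types of total n: non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        # The remaining parts may not exceed the part just chosen.
        for rest in partitions(n - first, first):
            yield (first,) + rest

def describe(part: tuple[int, ...], p: str = "p") -> str:
    """Plain-text name of the representative group of the given type."""
    return " (+) ".join(f"Z_{p}" if m == 1 else f"Z_({p}^{m})" for m in part)

types_for_5 = list(partitions(5))
names = [describe(t) for t in types_for_5]
```

Each tuple in `types_for_5` is one type, and `describe` turns it into a direct sum of cyclic groups written in plain text.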


Theorems 4 and 5 provide a way to describe all finite abelian groups. This is 
demonstrated in the following two examples. 


³⁶ Decompositions $n = m_1 + m_2 + \cdots + m_t$ with $m_1 \ge m_2 \ge \cdots \ge m_t \ge 1$ are called partitions of
the integer $n$ and are important in number theory.




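Example 6. Describe the abelian groups of order $p^2 q^3$, where $p$ and $q$ are distinct
primes.

Solution. If $|G| = p^2 q^3$, then $G = G(p) \oplus G(q)$, where $|G(p)| = p^2$ and $|G(q)| = q^3$
by Corollary 2 of Theorem 4. Thus, the possible types for $G(p)$ are (2) and (1, 1),
whereas those for $G(q)$ are (3), (2, 1), and (1, 1, 1). Hence, up to isomorphism, there
are six abelian groups $G$ of order $p^2 q^3$:


$\mathbb{Z}_{p^2} \oplus \mathbb{Z}_{q^3}$                                  $\mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_{q^3}$
$\mathbb{Z}_{p^2} \oplus \mathbb{Z}_{q^2} \oplus \mathbb{Z}_q$                      $\mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_{q^2} \oplus \mathbb{Z}_q$
$\mathbb{Z}_{p^2} \oplus \mathbb{Z}_q \oplus \mathbb{Z}_q \oplus \mathbb{Z}_q$            $\mathbb{Z}_p \oplus \mathbb{Z}_p \oplus \mathbb{Z}_q \oplus \mathbb{Z}_q \oplus \mathbb{Z}_q$    □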


Example 7. How many distinct abelian groups are there of order 1,333,584?


Solution. Because $1{,}333{,}584 = 2^4 \cdot 3^5 \cdot 7^3$, the primary components have orders $2^4$,
$3^5$, and $7^3$. The various types are


2-component: (4), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)
3-component: (5), (4, 1), (3, 2), (3, 1, 1), (2, 2, 1), (2, 1, 1, 1), (1, 1, 1, 1, 1)
7-component: (3), (2, 1), (1, 1, 1)


Thus, there are 5, 7, and 3 choices, respectively, for the primary components and
hence $5 \cdot 7 \cdot 3 = 105$ choices in all. Theorem 5 and Corollary 1 of Theorem 4
guarantee that no two are isomorphic. □
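
The count in Example 7 is a product of partition numbers, one for each exponent in the prime factorization of the order. The following sketch automates that computation under the assumption that trial division is fast enough for the orders of interest; the function names are hypothetical.

```python
from functools import lru_cache

def factorize(n: int) -> dict[int, int]:
    """Prime factorization of n >= 1 by trial division, as {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

@lru_cache(maxsize=None)
def count_partitions(n: int, largest: int) -> int:
    """Number of partitions of n with every part at most `largest`."""
    if n == 0:
        return 1
    return sum(count_partitions(n - k, k) for k in range(1, min(n, largest) + 1))

def count_abelian_groups(order: int) -> int:
    """Number of abelian groups of the given order, up to isomorphism."""
    total = 1
    for exponent in factorize(order).values():
        total *= count_partitions(exponent, exponent)
    return total

answer = count_abelian_groups(1_333_584)
```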


We have illustrated the general results about modules over a PID using abelian
groups ($\mathbb{Z}$-modules). However, there is another very important example. If $F$ is a
field, much of linear algebra is concerned with determining the nature of a linear
transformation $\alpha : V \to V$ where $V$ is a vector space over $F$. Theorem 1 gives a
satisfactory answer when $V$ is finite dimensional. There are two key observations:
First, if $t$ is an indeterminate over $F$ then the polynomial ring $F[t]$ is a PID by
Theorem 1 §4.3, and each nonzero ideal has a unique monic generator. The second
observation is that, given a linear transformation $\alpha : V \to V$, the vector space $V$
becomes a module over the polynomial ring $F[t]$ via the action


$p \cdot v = p(\alpha)(v)$, for all $p \in F[t]$ and all $v \in V$,


where we remember that $p(\alpha) : V \to V$ is a linear transformation for each $p$. If ${}_FV$
is finite dimensional, one shows that ${}_{F[t]}V$ is torsion and finitely generated. Hence,
the decompositions of ${}_{F[t]}V$ in Theorems 4 and 5 provide an elegant way to prove
many of the basic theorems of linear algebra. Moreover, if $A$ is an $n \times n$ matrix in
$M_n(F)$ then this yields the canonical forms for the matrix $A$ by taking $V = F^n$ and
considering the linear transformation $\alpha : V \to V$ given by $\alpha(v) = Av$ for all $v \in V$.
However, a discussion of the details is beyond the scope of this book.
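
To make the $F[t]$-action concrete, the sketch below evaluates $p \cdot v = p(A)v$ for a matrix $A$ with integer entries (so $F = \mathbb{Q}$ is assumed) and a polynomial given by its coefficient list. The names `mat_vec` and `act`, and the sample matrix, are illustrative choices and not part of the text.

```python
def mat_vec(A: list[list[int]], v: list[int]) -> list[int]:
    """Matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def act(p: list[int], A: list[list[int]], v: list[int]) -> list[int]:
    """Compute p . v = p(A) v, where p = [c0, c1, ..., cd] is c0 + c1 t + ... + cd t^d."""
    result = [0] * len(v)
    power = list(v)                       # holds A^k v, starting with k = 0
    for c in p:
        result = [r + c * w for r, w in zip(result, power)]
        power = mat_vec(A, power)         # advance to A^(k+1) v
    return result

A = [[0, -1],
     [1, 0]]                              # rotation by a quarter turn
t_times_v = act([0, 1], A, [3, 5])        # t . v = A v
killed = act([1, 0, 1], A, [3, 5])        # (t^2 + 1) . v
```

For this matrix $A^2 = -I$, so $t^2 + 1$ annihilates every vector. This is the sense in which $V$ is a torsion $F[t]$-module.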


The Fundamental Theorem 


If $R$ is a PID, we are going to prove Theorem 1 that every finitely generated torsion
module over $R$ is a direct sum of principal submodules in a unique way. The whole
thing depends on a result about free modules called the submodule theorem. This
in turn requires two preliminary results, each of interest in itself.




If $S$ is a nonempty set of submodules of a module $M$, a submodule $K \in S$ is
called maximal in $S$ if, whenever $K \subseteq N$ with $N \in S$, then necessarily $K = N$.
For example, the maximal ideals of a ring $R$ are the maximal members of the set


$S = \{A \mid A \text{ is an ideal and } A \neq R\}$.


Lemma 2. If R is a PID then every nonempty set S of ideals of R has a maximal 
member. 


Proof. Assume that $S$ has no maximal member, and choose $A_1 \in S$. Then $A_1$
is not maximal so let $A_1 \subset A_2$, where $A_2 \in S$. But $A_2$ is not maximal either,
so let $A_1 \subset A_2 \subset A_3$ where $A_3 \in S$. This process continues to create a strictly
increasing sequence $A_1 \subset A_2 \subset A_3 \subset \cdots \subset A_k \subset A_{k+1} \subset \cdots$ of ideals of $R$.³⁷
Define $A = A_1 \cup A_2 \cup A_3 \cup \cdots$. Then $A$ is an ideal so, since $R$ is a PID, let $A = Rd$
where $d \in R$. Since $d \in A$, we have $d \in A_k$ for some $k \ge 1$, and it follows that
$A \subseteq A_k \subset A_{k+1} \subseteq A$. But this implies that $A_k = A_{k+1}$, a contradiction. ∎


Lemma 3. Let ${}_RW$ be a free module of rank $n$ over a PID $R$. If $K \neq 0$ is a
submodule of $W$ then $K$ is also free and $\mathrm{rank}(K) \le n$.


Proof. Since rank $W = n$ there is an isomorphism $\sigma : W \to R^n$, so $K \cong \sigma(K) \subseteq R^n$.
Hence, we may assume that $K \subseteq R^n$, and we proceed by induction on $n$. If $n = 1$
then $K \subseteq R$ and the result follows from the fact that each nonzero ideal of $R$ has the
form $Rd$, where $d$ is torsion free. If $n \ge 2$, define $\varepsilon : K \to R$ by $\varepsilon(r_1, \ldots, r_n) = r_n$
whenever $(r_1, \ldots, r_n) \in K$. If $\varepsilon(K) = 0$ then $K \subseteq R^{n-1}$ and we are done by
induction. If $\varepsilon(K) \neq 0$ then, since it is an ideal of $R$, write $\varepsilon(K) = Ra$ where
$0 \neq a \in R$. Hence $\varepsilon : K \to Ra$ is onto so, since $Ra$ is free ($a$ is torsion free),
Theorem 6 §7.1 shows that $K = \ker \varepsilon \oplus K_1$ where $K_1 \cong Ra$. Now $\ker \varepsilon \subseteq R^{n-1}$, so
$\ker \varepsilon$ is free of rank at most $n - 1$ by induction. Since $K_1 \cong Ra$ is free of rank 1, it
follows that $K = \ker \varepsilon \oplus K_1$ is free of rank at most $n$. ∎


Important as Lemma 3 is, we need more detailed information about how the 
free submodule K is positioned in W. This is provided by the following fundamental 
theorem. 


Theorem 6. Submodule Theorem. Let $R$ be a PID, and let ${}_RW$ be a free module
of finite rank $n$. If $K \neq 0$ is a submodule of $W$, there exists a basis $\{y_1, y_2, \ldots, y_n\}$
of $W$, an integer $m \le n$, and nonzero elements $d_1, d_2, \ldots, d_m$ of $R$ such that


(1) $\{d_1 y_1, d_2 y_2, \ldots, d_m y_m\}$ is a basis of $K$.

(2) $d_i \mid d_{i+1}$ for each $i$.

In particular, $K$ is free of rank $m \le n$.

Proof. Let $\{x_1, x_2, \ldots, x_n\}$ be any basis of $W$, and let $\pi_i : W \to R$ be the projection
for each $i$ given by $\pi_i\left(\sum_j r_j x_j\right) = r_i$. Define

$S = \{\alpha(K) \mid \alpha : W \to R \text{ is } R\text{-linear}\}$.

Then $S$ consists of ideals of $R$, and $S$ is nonempty (it contains the zero ideal).
Hence, Lemma 2 shows that $S$ contains a maximal member $\lambda(K)$ where $\lambda : W \to R$


³⁷ Actually this requires a set-theoretical theorem called transfinite recursion. This is discussed in
Appendix D.




is $R$-linear. Observe that $\lambda(K) \neq 0$ (otherwise 0 is maximal in $S$, so $\pi_i(K) = 0$ for
each $i$, which implies that $K = 0$, contrary to assumption). Since $R$ is a PID, let


$\lambda(K) = Rd$, $0 \neq d \in R$.

Write $d = \lambda(z)$, where $z \in K$.

Claim 1. $d \mid \alpha(z)$ for each $R$-linear $\alpha : W \to R$.


Proof. Since $R$ is a PID let $e = \gcd(d, \alpha(z))$. Then $e \mid d$, $e \mid \alpha(z)$, and $e = rd + s\alpha(z)$
for some $r, s \in R$. With this, define


$\gamma : W \to R$ by $\gamma(x) = r\lambda(x) + s\alpha(x)$, for all $x \in W$.


Then $\gamma$ is $R$-linear and $\gamma(z) = rd + s\alpha(z) = e$. Hence, $e \in \gamma(K)$, so $Re \subseteq \gamma(K)$. But
$Rd \subseteq Re$ (because $e \mid d$) so we have $Rd \subseteq Re \subseteq \gamma(K)$. Since $Rd = \lambda(K)$ is maximal
in $S$, it follows that $Rd = Re = \gamma(K)$. In particular, $d = ue$ for some unit $u \in R$,
so the fact that $e \mid \alpha(z)$ implies that $d \mid \alpha(z)$. This proves Claim 1.


In particular, $d \mid \pi_i(z)$ for each $i$ by Claim 1, say $\pi_i(z) = c_i d$ where $c_i \in R$.
Define $y = \sum_{i=1}^n c_i x_i$. Then $dy = \sum_{i=1}^n d c_i x_i = \sum_{i=1}^n \pi_i(z) x_i = z \in K$ because the
$\pi_i$ are the projections. Hence, $d = \lambda(z) = \lambda(dy) = d\lambda(y)$. Since $R$ is an integral
domain, this implies that

$\lambda(y) = 1$.

Claim 2. (i) $W = Ry \oplus \ker \lambda$, and (ii) $K = Rdy \oplus (K \cap \ker \lambda)$.


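Proof. $Ry \cap \ker \lambda = 0$ because $\lambda(ry) = r\lambda(y) = r$ for each $r \in R$. To see that
$W = Ry + \ker \lambda$, let $x \in W$, write it as $x = \lambda(x)y + (x - \lambda(x)y)$, and observe that
$\lambda(x - \lambda(x)y) = \lambda(x) - \lambda(x)\lambda(y) = 0$ because $\lambda(y) = 1$. This proves (i).

For (ii), note first that $Rdy \cap (K \cap \ker \lambda) \subseteq Ry \cap \ker \lambda = 0$ by (i). To see that
$K = Rdy + (K \cap \ker \lambda)$, let $x_0 \in K$ and observe that $\lambda(x_0)y \in \lambda(K)y = Rdy \subseteq K$. If
we write $x_0$ as $x_0 = \lambda(x_0)y + (x_0 - \lambda(x_0)y)$, then (ii) follows because $\lambda(x_0)y \in Rdy$,
$x_0 - \lambda(x_0)y \in K$, and $\lambda(x_0 - \lambda(x_0)y) = \lambda(x_0) - \lambda(x_0)\lambda(y) = 0$. This proves Claim
2.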


We can now use these results to complete the proof of Theorem 6 by induction
on $n$. If $n = 1$ then $K \subseteq Rx_1$, and we consider $A = \{a \in R \mid ax_1 \in K\}$. This is an
ideal of $R$, say $A = Rd_1$, $d_1 \in R$. One verifies that $K = Rd_1 x_1$, so the bases $\{x_1\}$
and $\{d_1 x_1\}$ satisfy our requirements.

Let $n \ge 2$. Then $\ker \lambda$ is free by Lemma 3 and Claim 2(i); and $\ker \lambda$ has rank
$n - 1$ by Corollary 3 of Theorem 5 §7.1 because $Ry$ is free of rank 1 ($y$ is torsion-free
as $\lambda(y) = 1$). Hence, by induction, there exists a basis $\{y_2, \ldots, y_n\}$ of $\ker \lambda$, an
integer $m \le n$, and elements $d_2, \ldots, d_m$ in $R$ such that $d_i \mid d_{i+1}$ for $i \ge 2$ and
$\{d_2 y_2, \ldots, d_m y_m\}$ is a basis of $K \cap \ker \lambda$. Hence, $\{y, y_2, \ldots, y_n\}$ is a basis of $W$ by
Claim 2(i), and $\{dy, d_2 y_2, \ldots, d_m y_m\}$ is a basis of $K$ by Claim 2(ii).

Thus, it remains to show that $d \mid d_2$. To this end, define $\gamma : W \to R$ by taking
$\gamma(y) = 1 = \gamma(y_2)$ and $\gamma(y_i) = 0$ for $i > 2$. Then $d = \gamma(dy) \in \gamma(K)$, so $Rd \subseteq \gamma(K)$.
Since $Rd = \lambda(K)$ is maximal in $S$, we obtain $Rd = \gamma(K)$. But then we obtain
$d_2 = \gamma(d_2 y_2) \in \gamma(K) = Rd$, so $d \mid d_2$ as required. ∎
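
For $R = \mathbb{Z}$ the elements $d_1 \mid d_2 \mid \cdots \mid d_m$ can be computed for a concrete submodule. The sketch below does not follow the inductive proof above. Instead it uses a different, standard fact about integer matrices that is assumed here without proof: if $K \subseteq \mathbb{Z}^n$ is spanned by the rows of an integer matrix, then the product $d_1 d_2 \cdots d_k$ equals (up to sign) the gcd of all $k \times k$ minors of that matrix. The function names are hypothetical, and the cofactor determinant is only suitable for small matrices.

```python
from itertools import combinations
from math import gcd

def det(M: list[list[int]]) -> int:
    """Determinant of a square integer matrix by cofactor expansion."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def submodule_invariant_factors(A: list[list[int]]) -> list[int]:
    """d_1 | d_2 | ... for the subgroup of Z^n spanned by the rows of A."""
    rows, cols = len(A), len(A[0])
    factors, previous = [], 1
    for k in range(1, min(rows, cols) + 1):
        g = 0                                   # gcd of all k x k minors
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                g = gcd(g, det([[A[i][j] for j in c] for i in r]))
        if g == 0:                              # rank of the span is k - 1
            break
        factors.append(g // previous)
        previous = g
    return factors

diagonal_case = submodule_invariant_factors([[2, 0], [0, 3]])
general_case = submodule_invariant_factors([[2, 4], [6, 8]])
```

In the first case the spanning vectors $(2, 0)$ and $(0, 3)$ do not satisfy the divisibility condition, and the computation returns the factors 1 and 6 instead, which reflects the existence of a different basis of $\mathbb{Z}^2$ adapted to $K$.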


With this we can prove the main theorem of this chapter. 




Theorem 7. Fundamental Theorem. Let $R$ denote a PID. If ${}_RM$ is finitely
generated, there exist integers $m \ge 0$, $k \ge 0$, and nonzero nonunits $d_1, d_2, \ldots, d_m$
in $R$, such that $d_i \mid d_{i+1}$ for each $i$ and


$M \cong R/Rd_1 \oplus R/Rd_2 \oplus \cdots \oplus R/Rd_m \oplus R^k$.


Moreover, $m \in \mathbb{Z}$, $k \in \mathbb{Z}$, and the elements $d_i \in R$ are uniquely determined by $M$.
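
For a finite abelian group given as a direct sum of cyclic groups, the $d_i$ of the theorem can be assembled from the prime-power orders of the primary components. The sketch below illustrates this for $R = \mathbb{Z}$. It assumes the standard fact that $\mathbb{Z}_a \oplus \mathbb{Z}_b \cong \mathbb{Z}_{ab}$ when $\gcd(a, b) = 1$, and the function names are hypothetical.

```python
def prime_powers(n: int) -> list[tuple[int, int]]:
    """Factor n >= 1 into (prime, exponent) pairs by trial division."""
    out, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            out.append((p, e))
        p += 1
    if n > 1:
        out.append((n, 1))
    return out

def invariant_factors(orders: list[int]) -> list[int]:
    """d_1 | d_2 | ... | d_m for the group Z_{n_1} + ... + Z_{n_s}."""
    exponents: dict[int, list[int]] = {}     # prime -> type of its primary component
    for n in orders:
        for p, e in prime_powers(n):
            exponents.setdefault(p, []).append(e)
    for es in exponents.values():
        es.sort(reverse=True)
    length = max((len(es) for es in exponents.values()), default=0)
    factors = []
    for i in range(length):
        d = 1
        for p, es in exponents.items():
            if i < len(es):
                d *= p ** es[i]              # i-th largest power of each prime
        factors.append(d)
    return factors[::-1]                     # smallest first, each divides the next

example = invariant_factors([4, 6])          # Z_4 + Z_6
```

Here $\mathbb{Z}_4 \oplus \mathbb{Z}_6$ has primary components of types (2, 1) for the prime 2 and (1) for the prime 3, giving $d_1 = 2$ and $d_2 = 12$.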


Proof. We split the proof into showing that such a decomposition exists, and then 
that m, k and the d; are unique. 


Existence. If ${}_RM$ is $n$-generated, let $\theta : W \to M$ be an onto $R$-homomorphism
where $W$ is free of rank $n$. If we write $\ker \theta = K$, the submodule theorem (Theorem
6) provides a basis $\{y_1, y_2, \ldots, y_n\}$ of $W$, an integer $m \le n$, and nonzero nonunits
$d_1, d_2, \ldots, d_m$ of $R$ such that $d_i \mid d_{i+1}$ for each $i$ and $\{d_1 y_1, d_2 y_2, \ldots, d_m y_m\}$ is a
basis of $K$. Thus,


$W = Ry_1 \oplus Ry_2 \oplus \cdots \oplus Ry_m \oplus Ry_{m+1} \oplus \cdots \oplus Ry_n$.

Hence, the isomorphism theorem and Corollary 3 of Theorem 3 §7.1 give


$M \cong W/K \cong \dfrac{Ry_1}{Rd_1 y_1} \oplus \dfrac{Ry_2}{Rd_2 y_2} \oplus \cdots \oplus \dfrac{Ry_m}{Rd_m y_m} \oplus Ry_{m+1} \oplus \cdots \oplus Ry_n$.


Since $Ry_i/Rd_i y_i \cong R/Rd_i$ for each $i$, and $Ry_j \cong R$ for each $j$, this proves the
existence part of Theorem 7.


Uniqueness. Each factor module $R/Rd_i$ is principal; in fact $R/Rd_i = R(1 + Rd_i)$
and $o(1 + Rd_i) = d_i$. Since $R = R1$ is also principal, we obtain an (equivalent)
internal direct sum decomposition of $M$:

$$M = (Rx_1 \oplus Rx_2 \oplus \cdots \oplus Rx_m) \oplus (Rw_1 \oplus \cdots \oplus Rw_k),$$

where $m \ge 0$, $k \ge 0$, $w_j$ is torsion-free for each $j$, $o(x_i) = d_i$ for each $i = 1, 2, \ldots, m$,
and $d_i \mid d_{i+1}$. Note that this proves Theorem 1 (with an additional uniqueness
statement), and hence completes the proofs of Theorems 4 and 5. We use these results
to prove uniqueness in the present theorem.

For simplicity, we shall write $T = T(M)$, $X = Rx_1 \oplus Rx_2 \oplus \cdots \oplus Rx_m$, and
$W = Rw_1 \oplus \cdots \oplus Rw_k$, so that $M = X \oplus W$. We have $X \subseteq T$ because $d_ix_i = 0$ for
each $i$, and we claim that this is equality. If $z \in T$ write $z = x + w$ where $x \in X$
and $w \in W$. Then $z - x = w \in T \cap W$, so $z - x$ is both torsion and torsion free
($W$ is free). Hence, $z - x = 0$, so $z = x \in X$ and we have proved that $T \subseteq X$.
This shows that $X = T$, and hence that $M = T \oplus W$. But then $W \cong M/T$, and so
$k = \operatorname{rank}(M/T)$ is uniquely determined by $M$.

For the rest, let $p_1, p_2, \ldots, p_\ell$ be the distinct primes dividing at least one of the
elements $d_1, d_2, \ldots, d_m$, and write

$$d_i = p_1^{k_{i1}} p_2^{k_{i2}} \cdots p_\ell^{k_{i\ell}}, \quad k_{ij} \ge 0, \qquad (*)$$

where some of the $k_{ij}$ may be zero.


Claim. $Rx_i = Rx_{i1} \oplus Rx_{i2} \oplus \cdots \oplus Rx_{i\ell}$, where $o(x_{ij}) = p_j^{k_{ij}}$ for each $j$.

Proof. Write $d_{ij} = d_i / p_j^{k_{ij}}$ for each $j$, and define $x_{ij} = d_{ij}x_i$. Then $o(x_{ij}) = p_j^{k_{ij}}$
because$^{88}$ $o(x_i) = d_{ij}p_j^{k_{ij}}$. Moreover, $d_{i1}, d_{i2}, \ldots, d_{i\ell}$ are relatively prime (no prime
divides all of them), say $1 = r_1d_{i1} + r_2d_{i2} + \cdots + r_\ell d_{i\ell}$ for some $r_j \in R$. It follows
that $Rx_i = Rd_{i1}x_i + Rd_{i2}x_i + \cdots + Rd_{i\ell}x_i$. This sum is direct by Theorem 4 because
$Rx_{ij}$ is contained in the $p_j$-primary component of $Rx_i$ (in fact $p_j^{k_{ij}}(Rx_{ij}) = 0$). This
proves the Claim.


It follows that $T$ is the direct sum of all the modules $Rx_{ij}$, and so its
$p_j$-primary component is the sum of all these summands that are $p_j$-modules, that is
$T(p_j) = Rx_{1j} \oplus Rx_{2j} \oplus \cdots \oplus Rx_{mj}$. Thus, the primes $p_j$ are uniquely determined by $M$
(they are the primes $p$ such that $T(p) \ne 0$). Moreover, the fact that $d_i \mid d_{i+1}$ for each
$i$ implies that $k_{1j} \le k_{2j} \le \cdots \le k_{mj}$ for each $j$. Eliminating zero values, this shows
that $(k_{mj}, k_{m-1,j}, k_{m-2,j}, \ldots)$ is the type of the $p_j$-module $T(p_j)$. Hence, the $k_{ij}$ are
uniquely determined by $M$. But then (*) shows that the elements $d_1, d_2, \ldots, d_m$
(and hence $m$) are also uniquely determined. ∎


In the proof we obtained a useful internal direct sum decomposition 


Corollary 1. If $R$ is a PID and ${}_RM$ is finitely generated, there exist $m \ge 0$, $k \ge 0$
in $\mathbb{Z}$, and nonzero nonunits $d_1, d_2, \ldots, d_m$ in $R$, such that $d_i \mid d_{i+1}$ for each $i$ and

$$M = (Rx_1 \oplus Rx_2 \oplus \cdots \oplus Rx_m) \oplus (Rw_1 \oplus \cdots \oplus Rw_k),$$

where $w_j$ is torsion free for each $j$ and $o(x_i) = d_i$ for each $i$.


Note that this proves Theorem 1 (which we have used several times). 

The elements $d_i$ in Theorem 7 are called the invariant factors for the module
$M$. In the notation of the proof, the elements $p_j^{k_{ij}}$ are the elementary divisors of
$M(p_j)$ for each $j$, and so are uniquely determined (up to unit multiples) by the
module $M$. They are called the elementary divisors of $M$.


Specializing Corollary 1 to the case of $\mathbb{Z}$-modules, we obtain the main motivating
example for Theorem 6. We use the fact that every cyclic group of order $n$ is
isomorphic to $\mathbb{Z}_n$.


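Corollary 2. Fundamental Theorem of Finitely Generated Abelian Groups.
If $G$ is a finitely generated abelian group then

$$G \cong \mathbb{Z}_{t_1} \oplus \mathbb{Z}_{t_2} \oplus \cdots \oplus \mathbb{Z}_{t_m} \oplus \mathbb{Z}^k,$$

where $t_i > 1$ and $t_i \mid t_{i+1}$ for each $i$. The integers $k, m, t_1, \ldots, t_m$ are uniquely
determined by $G$.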
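
For finite groups, Corollary 2 turns the classification of abelian groups of a given order into a counting problem: for each prime power $p^e$ dividing the order one chooses a type, that is, a partition of $e$, and the invariant factors are then assembled from the largest summands downward. The following Python sketch illustrates this under our own naming; the functions `prime_factorization`, `partitions`, and `abelian_groups` are hypothetical helpers introduced only for this illustration, not notation from the text.

```python
from itertools import product


def prime_factorization(n):
    """Return {prime: exponent} for an integer n >= 1 (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def partitions(k, largest=None):
    """Yield the partitions of k as non-increasing tuples (the possible types)."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest


def abelian_groups(n):
    """List the invariant-factor tuples (t_1 | t_2 | ... | t_m) of the abelian groups of order n."""
    factors = prime_factorization(n)
    primes = sorted(factors)
    groups = []
    # Choose one type (partition of the exponent) for every prime dividing n.
    for types in product(*(list(partitions(factors[p])) for p in primes)):
        m = max((len(t) for t in types), default=0)
        invariants = [1] * m
        for p, t in zip(primes, types):
            # The largest exponent contributes to the last invariant factor.
            for idx, e in enumerate(t):
                invariants[m - 1 - idx] *= p ** e
        groups.append(tuple(invariants))
    return sorted(groups)
```

For example, `abelian_groups(12)` lists the tuples `(2, 6)` and `(12,)`, corresponding to $\mathbb{Z}_2 \oplus \mathbb{Z}_6$ and $\mathbb{Z}_{12}$.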


The theorem that every finite abelian group is a direct sum of cyclic groups was 
first proved in 1870 by Leopold Kronecker. The uniqueness came later as did the 
extension to finitely generated groups. 

We can easily pass back and forth between the primary decomposition of a 
torsion module and the decomposition in the fundamental theorem in terms of the 
invariant factors. This is illustrated in Example 8 for a torsion abelian group. 


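$^{88}$In general, if $o(x) = ab \ne 0$ then $o(ax) = b$.


Example 8. Let $G = \mathbb{Z}_{15} \oplus \mathbb{Z}_{90} \oplus \mathbb{Z}_{540}$, so the invariant factors are 15, 90, and
540. Using the primary decomposition theorem on each of these summands, and
rearranging the resulting summands, we obtain

$$G \cong (\mathbb{Z}_3 \oplus \mathbb{Z}_5) \oplus (\mathbb{Z}_2 \oplus \mathbb{Z}_9 \oplus \mathbb{Z}_5) \oplus (\mathbb{Z}_4 \oplus \mathbb{Z}_{27} \oplus \mathbb{Z}_5)$$
$$\cong (\mathbb{Z}_2 \oplus \mathbb{Z}_4) \oplus (\mathbb{Z}_3 \oplus \mathbb{Z}_9 \oplus \mathbb{Z}_{27}) \oplus (\mathbb{Z}_5 \oplus \mathbb{Z}_5 \oplus \mathbb{Z}_5).$$

Thus, the primary components (and hence the elementary divisors) are

$G(2) = \mathbb{Z}_4 \oplus \mathbb{Z}_2$, type $(2, 1)$,
$G(3) = \mathbb{Z}_{27} \oplus \mathbb{Z}_9 \oplus \mathbb{Z}_3$, type $(3, 2, 1)$,
$G(5) = \mathbb{Z}_5 \oplus \mathbb{Z}_5 \oplus \mathbb{Z}_5$, type $(1, 1, 1)$.

On the other hand, given these primary components, we can retrieve the groups
$\mathbb{Z}_{540}$, $\mathbb{Z}_{90}$, and $\mathbb{Z}_{15}$, respectively, as the sum of the summands in the primary
components of largest order, second largest order, and so on:

$\mathbb{Z}_{540} \cong \mathbb{Z}_4 \oplus \mathbb{Z}_{27} \oplus \mathbb{Z}_5$,
$\mathbb{Z}_{90} \cong \mathbb{Z}_2 \oplus \mathbb{Z}_9 \oplus \mathbb{Z}_5$,
$\mathbb{Z}_{15} \cong 0 \oplus \mathbb{Z}_3 \oplus \mathbb{Z}_5$. □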
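
The passage between invariant factors and elementary divisors in Example 8 is mechanical, and can be sketched in Python for finite abelian groups as follows. This is only an illustration under our own conventions: the function names `prime_power_parts`, `elementary_divisors`, and `invariant_factors` are hypothetical, and a group is represented simply by its list of invariant factors.

```python
def prime_power_parts(n):
    """Return the prime-power factors of n, e.g. 540 -> {2: 4, 3: 27, 5: 5}."""
    parts = {}
    p = 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                q *= p
                n //= p
            parts[p] = q
        p += 1
    if n > 1:
        parts[n] = n
    return parts


def elementary_divisors(invariants):
    """Map each prime p to the orders of the cyclic summands of G(p), largest first."""
    components = {}
    for d in invariants:
        for p, q in prime_power_parts(d).items():
            components.setdefault(p, []).append(q)
    return {p: sorted(qs, reverse=True) for p, qs in sorted(components.items())}


def invariant_factors(components):
    """Multiply the largest summands of each G(p), then the second largest, and so on."""
    longest = max((len(qs) for qs in components.values()), default=0)
    factors = []
    for i in range(longest):
        d = 1
        for qs in components.values():
            ordered = sorted(qs, reverse=True)
            if i < len(ordered):
                d *= ordered[i]
        factors.append(d)
    return sorted(factors)
```

Applied to the invariant factors `[15, 90, 540]`, `elementary_divisors` returns the three primary components listed in the example, and `invariant_factors` applied to that result recovers `[15, 90, 540]`.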


Exercises 7.2 


Throughout these exercises R is a PID and modules are R-modules unless otherwise 
specified. 


1. Write down all the abelian groups (up to isomorphism) of each order: 


(a) 9 (b) 10 (c) 12 (d) 27 
(e) 30 (f) 60 (g) 108 
2. If p is a prime, determine all the abelian groups of order: 
(a) p* (b) p® 
3. If p # q are primes, determine all the abelian groups of order: 
(a) pq? (b) p?q? 


4. If $p$, $q$, and $r$ are distinct primes, determine how many nonisomorphic abelian groups
there are of order:
(a) p?qer* (b) p®qr? 
5. List the types of all nonzero subgroups of G if G is a p-group of type (3, 2, 1). 
6. If G is an abelian group with |G| = 108, and G(2) and G(3) have types (2) and 
(2,1), respectively, how many nonisomorphic subgroups does G have? 
7. Find the type of the primary components of 
(a) G = Zy2 © Ze © Z75 (b) G = Zz6 @ Zaz © Zog 
8. Determine the abelian groups of order p” containing 
(a) an element of order p”~?, 
(b) an element of order p”~?. 
9. Determine the abelian groups of order p® containing 
(a) no element of order greater than p’, 
(b) no element of order p*. 
10. Are the groups Zs ® Zi9 @ Los © Zz6 @ Zs4 and Zsq @ Zing ® Zaso isomorphic? 
11. Let $x \in {}_RM$ have $o(x) = d \ne 0$. If $d = ab$, show that $o(ax) = b$.
12. Let $K \subseteq M$ be modules. Show that
(a) $T(K) = K \cap T(M)$.
(b) If $K \subseteq T(M)$, show that $T(M/K) = T(M)/K$.
13. If $K \subseteq M$ are modules, show that $M$ is torsion if and only if both $K$ and $M/K$ are
torsion.


14. Show that $\mathbb{Q}/\mathbb{Z}$ is a torsion group that is not finite.
15. If $M = M_1 \oplus \cdots \oplus M_n$ are modules, show that $T(M) = T(M_1) \oplus \cdots \oplus T(M_n)$.
16. Let $K \subseteq M$ be finitely generated abelian groups. Show that
(a) $K/T(K)$ is isomorphic to a subgroup of $M/T(M)$.
(b) $T(K) \subseteq T(M)$ and $T(M)/T(K)$ is isomorphic to a subgroup of $T(M/K)$.
17. Let $M$ be an abelian group and assume that $M = H \oplus W$, where $H$ is torsion and
$W$ is torsion free. Show that $H = T(M)$ and $W \cong M/T(M)$.
18. If $\alpha: M \to N$ is a homomorphism of $R$-modules, show that $\alpha[T(M)] \subseteq T(N)$, and
that there is a unique homomorphism $\bar{\alpha}: M/T(M) \to N/T(N)$ satisfying $\bar{\alpha}\varphi = \theta\alpha$,
where $\varphi: M \to M/T(M)$ and $\theta: N \to N/T(N)$ are the coset maps.
[Diagram: commutative square with $\alpha: M \to N$ on top, $\bar{\alpha}: M/T(M) \to N/T(N)$ on the bottom, and the coset maps $\varphi$ and $\theta$ as the vertical arrows.]
19. Describe $T(\mathbb{C}^*)$ and $T(\mathbb{Q}^*)$.
20. Let $d \in R$. If $N$ is any module define $L_d(N) = \{x \in N \mid dx = 0\}$.
(a) Show that $L_d(N)$ is a submodule of $N$.
(b) If $d \ne 0$, show that $L_d(R/Rd^k) = R(d^{k-1} + Rd^k)$ for all $k \ge 1$.
(c) If $M = \oplus_{i=1}^n M_i$, show that $L_d(M) = \oplus_{i=1}^n L_d(M_i)$.
21. Let $d \in R$. If $N$ is any module define $dN = \{dx \mid x \in N\}$.
(a) Show that $dN$ is a submodule of $N$.
(b) If $M = \oplus_{i=1}^n M_i$, show that $dM = \oplus_{i=1}^n dM_i$.
22. Given a prime $p \in R$ and a module ${}_RM$, define $L_p(M) = \{x \in M \mid px = 0\}$
and $pM = \{px \mid x \in M\}$. (These are submodules and preserve direct sums by the
preceding exercises.)
(a) If $M = Rx$ where $o(x) = p^m$, $m \ge 1$, show that $L_p(M) \cong R(p^{m-1} + Rp^m)$ and
that $pM \cong R/Rp^{m-1}$ if $m > 1$, while $pM = 0$ if $m = 1$.
(b) If $M$ is a $p$-module of type $(m_1, m_2, \ldots, m_t)$, show that $L_p(M)$ has type $(1, 1, \ldots, 1)$
with $t$ ones. Also show that, if $pM \ne 0$, it has type $(m_1 - 1, m_2 - 1, \ldots, m_s - 1)$,
where $m_s > 1$ but $m_{s+1} = \cdots = m_t = 1$.
23. If $p$ is a prime, determine the structure of a finite abelian group $G$ if $pg = 0$ for all
$g \in G$.
24. Show that every submodule of a finitely generated module over a PID is again finitely
generated. [Hint: Submodule Theorem.]
25. If $G$ is abelian and $n$ divides $|G|$, show that $G$ has a subgroup of order $n$. Show that
this conclusion fails if $G$ is not abelian.
26. Let $G = \mathbb{Z}a$ where $o(a) = p^n$, $p$ a prime. If $k \le n$ show that $L_{p^k}(G) = \mathbb{Z}p^{n-k}a$, and
hence that $|L_{p^k}(G)| = p^k$.
27. Let $G$ be a finite abelian $p$-group of type $(n_1, n_2, \ldots, n_r)$.
(a) Show that $|L_p(G)| = p^r$, so there are $p^r - 1$ elements of order $p$.
(b) Let $s$ be the number of integers $i$ such that $n_i \ge 2$ (possibly $s = 0$). Show that
$|L_{p^2}(G)| = p^{r+s}$, and hence that $G$ has $p^r(p^s - 1)$ elements of order $p^2$.
28. If $M \oplus M \cong N \oplus N$, where $M$ and $N$ are finitely generated $p$-modules, show that
$M \cong N$.
29. If $G \oplus H \cong G \oplus K$, where $G$, $H$, and $K$ are finite abelian $p$-groups, show that
$H \cong K$ (the cancellation property). [Hint: Reduce to the case where $G$ is cyclic.]


Chapter 8


p-Groups and the Sylow 
Theorems 


Mathematics is the tool specially suited for dealing with abstract concepts of any kind. 
There is no limit to its power in this field. 


—Paul Adrien Maurice Dirac 


Historically, the theory of groups was concerned only with groups of permutations 
of a set. This point of view is reinforced by Cayley’s theorem, which shows that 
every abstract group can be viewed as a subgroup of a group of permutations. The 
concept of an abstract group became important because it focuses attention on 
those aspects of a group of permutations that do not depend on the underlying set. 
However, this abstract formulation of the theory loses sight of the combinatorial 
aspects that are more in evidence for groups of permutations. And these counting 
methods give important information about abstract groups. The best example is 
Lagrange’s theorem, which is based on the fact that a subgroup partitions the group 
into cosets each having the same number of elements as the subgroup. 

In Section 8.2, we derive another such counting theorem, the class equation, 
from a partition of a finite group and use it, among other things, to deduce many 
properties of groups of prime power order. Then, in Section 8.3, we present a far- 
reaching counting method that includes the proof of Lagrange’s theorem and the 
class equation and which, in Section 8.4, we use to prove the Sylow theorems. These 
beautiful results guarantee the presence of subgroups of prime power order in every 
finite group and inform us about how many such subgroups there are. 


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 


8.1 PRODUCTS AND FACTORS 


An important goal of the theory of groups is to prove structure theorems, that 
is to describe all groups of a certain type in terms of particular constructions of 
well-known subgroups of that type. For example, in Section 7.2 we showed that 
every finite abelian group is a finite direct product of cyclic subgroups. We begin 
this section by describing when a group $G$ is isomorphic to a finite direct product
$G_1 \times G_2 \times \cdots \times G_n$ of subgroups of $G$. These direct factors $G_i$ are also images of
$G$, and we gain some insight as to how to describe these images in the second part
of this section.


Products of Subgroups 
If $X$ and $Y$ are nonempty subsets of a group, define their product as follows:

$$XY = \{xy \mid x \in X \text{ and } y \in Y\}.$$

This is an associative multiplication; indeed if $Z$ is another nonempty subset then

$$(XY)Z = \{xyz \mid x \in X,\ y \in Y \text{ and } z \in Z\} = X(YZ),$$

as the reader can verify. Moreover, $\{1\}X = X = X\{1\}$ for all nonempty sets $X$,
so the set of all nonempty subsets of $G$ is a monoid with unity $\{1\}$. Moreover, the
map $a \mapsto \{a\}$ is a group embedding (one-to-one homomorphism), so we identify $G$ as
a subgroup of this monoid by identifying $a = \{a\}$ for all $a \in G$. Hence, we write
$X\{a\} = Xa$ and $\{a\}X = aX$, which agrees with our earlier usage for cosets $Ha$
and conjugates $a^{-1}Ha$ of a subgroup $H$.

These products are most useful for subgroups. If $H$ and $K$ are subgroups of some
group, the product $HK$ is again a subgroup if and only if $HK = KH$ (Lemma 2
§2.8), and this holds if either $H \lhd G$ or $K \lhd G$ (where $K \lhd G$ means $K$ is normal
in $G$). The following result will be used several times in this chapter.


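Lemma 1. Modular law. If $H$, $K$, and $M$ are subgroups of a group and $H \subseteq M$
then $H(K \cap M) = HK \cap M$.


Proof. The inclusion $H(K \cap M) \subseteq HK \cap M$ is clear because $H \subseteq M$. If $x \in HK \cap M$,
say $x = hk = m$, then $k = h^{-1}m \in K \cap M$, so $x = hk \in H(K \cap M)$. ∎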


For reference, Theorem 6 §2.8 reads

If $H \lhd G$, $K \lhd G$, and $H \cap K = \{1\}$ then $HK \cong H \times K$. (*)


The next theorem is an important extension of this which requires only that $K \lhd G$.
Recall that the isomorphism theorem for groups asserts that, if $\alpha: G \to H$ is a group
homomorphism, then $\ker \alpha \lhd G$ and $G/\ker \alpha \cong \alpha(G)$.


Theorem 1. Second Isomorphism Theorem. Let $H$ and $K$ be subgroups of a
group $G$ with $K \lhd G$. Then $HK$ is a subgroup of $G$, $K \lhd HK$, $H \cap K \lhd H$, and

$$\frac{HK}{K} \cong \frac{H}{H \cap K}.$$

Proof. $HK$ is a subgroup by Lemma 2 §2.8, and $K \lhd HK$ because $K \lhd G$ and $K \subseteq HK$. Define
$\alpha: H \to HK/K$ by $\alpha(h) = Kh$. (Note that $Kh$ is in $HK/K$ because $H \subseteq HK$.)
Then $\alpha$ is a homomorphism, and it is onto because, if $x \in HK = KH$, say $x = kh$,
then $Kx = K(kh) = (Kk)h = Kh = \alpha(h)$. Now the theorem follows from the
isomorphism theorem because $\ker(\alpha) = \{h \in H \mid Kh = K\} = H \cap K$. ∎


If $H$ and $K$ are both finite subgroups, we saw in Corollary 1 to Theorem 6 §2.8
that $|HK| = |H||K|$ whenever $H \cap K = \{1\}$. Here is a useful generalization.


Theorem 2. Let $H$ and $K$ be finite subgroups of a group. Then

$$|HK|\,|H \cap K| = |H|\,|K|.$$


Proof. Write $N = H \cap K$ for convenience. Let $Nk_1, \ldots, Nk_m$ be the distinct cosets
of $N$ in $K$, so that $m = |K : N|$ is the index of $N$ in $K$.


Claim. $HK = Hk_1 \cup Hk_2 \cup \cdots \cup Hk_m$, a disjoint union.


Proof. If $k \in K$, then $k = nk_i$ for some $n \in N$ and some $i$. If $h \in H$ then $hk = (hn)k_i \in Hk_i$,
which proves that $HK \subseteq Hk_1 \cup \cdots \cup Hk_m$. Thus, $HK = Hk_1 \cup \cdots \cup Hk_m$. To see that
this is a disjoint union, suppose that $Hk_i \cap Hk_j$ is nonempty. Then $Hk_i = Hk_j$, so
$k_ik_j^{-1} \in H \cap K = N$. Hence, $Nk_i = Nk_j$, so $i = j$. This proves the Claim.

Finally, the Claim gives $|HK| = m|H| = |K : N||H| = \frac{|K|}{|N|}|H|$, as required. ∎
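
Theorem 2 can be checked numerically in a small case. The sketch below is an illustration only, assuming we model permutations of $\{0, 1, 2, 3\}$ as tuples; the helper names `compose`, `generate`, and `set_product` are our own. It takes $H$ to be the copy of $S_3$ fixing the point 3 and $K = \{\varepsilon, (0\ 1), (2\ 3), (0\ 1)(2\ 3)\}$ inside $S_4$.

```python
def compose(s, t):
    """Return the permutation 'apply t first, then s' on {0, ..., n-1}."""
    return tuple(s[t[i]] for i in range(len(t)))


def generate(gens, n):
    """Return the subgroup of S_n generated by gens (closure under composition)."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def set_product(X, Y):
    """Return the product XY = {xy | x in X, y in Y} of two subsets."""
    return {compose(x, y) for x in X for y in Y}


# H: permutations fixing 3, generated by (0 1) and (0 1 2).
H = generate([(1, 0, 2, 3), (1, 2, 0, 3)], 4)
# K: generated by (0 1) and (2 3).
K = generate([(1, 0, 2, 3), (0, 1, 3, 2)], 4)
HK = set_product(H, K)

# Theorem 2: |HK| |H n K| = |H| |K|.
print(len(HK), len(H & K), len(H), len(K))
```

Here $|H| = 6$, $|K| = 4$, and $|H \cap K| = 2$, so the product set $HK$ has 12 elements. In this instance $HK \neq KH$, so $HK$ is not a subgroup, which shows that Theorem 2 counts elements of a set and does not require $HK$ to be a group.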
We now give two generalizations of Theorem 6 §2.8 (see (*) above); the first
drops the condition that $H \cap K = \{1\}$.


Theorem 3. If $H \lhd G$ and $K \lhd G$ then $\dfrac{HK}{H \cap K} \cong \dfrac{H}{H \cap K} \times \dfrac{K}{H \cap K}$.


Proof. Write $N = H \cap K$ for convenience, and define

$$\alpha: HK \to \frac{H}{N} \times \frac{K}{N} \quad \text{by} \quad \alpha(hk) = (Nh, Nk).$$

Then $\alpha$ is well defined because $hk = h_1k_1$ means $h_1^{-1}h = k_1k^{-1} \in H \cap K = N$.
Hence, $hN = h_1N$, so $Nh = Nh_1$ because $N \lhd G$. Similarly, $Nk = Nk_1$.


Claim. $\alpha$ is a homomorphism.


Proof. Write $x = hk$ and $y = h_1k_1$ in $HK = KH$, and write $kh_1 = h_2k_2$ in
$HK$. Then $xy = hh_2k_2k_1$ so $\alpha(xy) = (Nh\,Nh_2, Nk_2\,Nk_1)$. On the other hand, we
have $\alpha(x)\alpha(y) = (Nh\,Nh_1, Nk\,Nk_1)$, so we must prove that $Nh\,Nh_2 = Nh\,Nh_1$
and $Nk_2\,Nk_1 = Nk\,Nk_1$, equivalently (since $G/N$ is a group) $Nh_2 = Nh_1$ and
$Nk_2 = Nk$. But we have $kh_1 = h_2k_2$, so

$h_1h_2^{-1} = k^{-1}(h_2k_2h_2^{-1}) \in H \cap K = N$ because $K \lhd G$, and

$k_2k^{-1} = h_2^{-1}(kh_1k^{-1}) \in K \cap H = N$ because $H \lhd G$.

Hence, $Nh_2 = Nh_1$ and $Nk_2 = Nk$, which proves the Claim.


With the Claim, $\alpha$ is an onto homomorphism, so we are done by the isomorphism
theorem because $\ker \alpha = \{hk \mid Nh = N \text{ and } Nk = N\} = N$. ∎


Our second generalization of Theorem 6 §2.8 (see (*) above) is to extend it to
more than two factors. We require the following result, interesting in itself.


Lemma 2. The following are equivalent for subgroups $G_1, G_2, \ldots, G_n$ of a group:
(1) $(G_1G_2 \cdots G_{k-1}) \cap G_k = \{1\}$ for each $k = 2, 3, \ldots, n$.
(2) If $g_1g_2 \cdots g_n = 1$, where $g_i \in G_i$ for each $i$, then $g_i = 1$ for each $i$.


Proof. Given (1), if $g_1g_2 \cdots g_n = 1$ then $g_n^{-1} \in (G_1G_2 \cdots G_{n-1}) \cap G_n = \{1\}$, so $g_n = 1$
and $g_1g_2 \cdots g_{n-1} = 1$. Now repeat the procedure to obtain $g_{n-1} = 1$. Continue to
prove (1) $\Rightarrow$ (2). The proof that (2) $\Rightarrow$ (1) is left to the reader. ∎


Call subgroups $G_1, G_2, \ldots, G_n$ unconnected if the conditions in Lemma 2 are
satisfied. Using this notion, we now give a useful characterization of when a group
is isomorphic to a direct product of finitely many subgroups.


Theorem 4. Let $G$ be a group and assume that $G = G_1G_2 \cdots G_n$ where the $G_i$
are subgroups. Then the following conditions are equivalent:

(1) The $G_i$ are unconnected and $G_k \lhd G$ for each $k$.
(2) The $G_i$ are unconnected and $g_ig_j = g_jg_i$ when $g_i \in G_i$, $g_j \in G_j$ and $i \ne j$.
(3) $(G_1 \cdots G_{k-1}G_{k+1} \cdots G_n) \cap G_k = \{1\}$ and $G_k \lhd G$ for each $k$.
(4) $(G_1 \cdots G_{k-1}G_{k+1} \cdots G_n) \cap G_k = \{1\}$ for each $k$ and $g_ig_j = g_jg_i$ whenever
$g_i \in G_i$, $g_j \in G_j$ and $i \ne j$.

In this case $G \cong G_1 \times G_2 \times \cdots \times G_n$, and each $g \in G$ is uniquely represented as a
product $g = g_1g_2 \cdots g_n$, where $g_i \in G_i$ for each $i$.


Proof. (1) $\Rightarrow$ (2). Given (1), let $g_i \in G_i$, $g_j \in G_j$ and $i \ne j$. Assume $i < j$, so that
$G_i \cap G_j \subseteq (G_1G_2 \cdots G_{j-1}) \cap G_j = \{1\}$. If we write $a = g_ig_jg_i^{-1}g_j^{-1}$ then $a \in G_i$
because $g_jg_i^{-1}g_j^{-1} \in G_i \lhd G$. Similarly $a \in G_j$, so $a \in G_i \cap G_j = \{1\}$. Hence $a = 1$,
so $g_ig_j = g_jg_i$, as required.

(2) $\Rightarrow$ (1). Let $b \in G_k$, $a \in G$; we want $a^{-1}ba \in G_k$. If $a = a_1a_2 \cdots a_n$, $a_i \in G_i$
then, since $b$ commutes with each of $a_1^{-1}, a_2^{-1}, \ldots, a_{k-1}^{-1}$ by (2), we have

$$a^{-1}ba = (a_n^{-1} \cdots a_1^{-1})\,b\,(a_1 \cdots a_n) = a_n^{-1} \cdots a_{k+1}^{-1}(a_k^{-1}ba_k)a_{k+1} \cdots a_n.$$

But $a_k^{-1}ba_k \in G_k$ and so commutes with each of $a_{k+1}, \ldots, a_n$. It follows that
$a^{-1}ba = a_k^{-1}ba_k \in G_k$, so $G_k \lhd G$. This proves (1).

(3) $\Rightarrow$ (4) and (4) $\Rightarrow$ (2). Both (3) and (4) imply that the $G_i$ are unconnected.

(2) $\Rightarrow$ (4). Choose $b_k \in G_k \cap (G_1 \cdots G_{k-1}G_{k+1} \cdots G_n)$ and write $b_k = a_1 \cdots a_{k-1}a_{k+1} \cdots a_n$
with $a_i \in G_i$ for each $i$. Since $b_k$ commutes with these $a_i$, this gives
$1 = a_1 \cdots a_{k-1}b_k^{-1}a_{k+1} \cdots a_n$, so $b_k = 1$ by (2) and Lemma 2. This proves (4).


Now define $\theta: G_1 \times G_2 \times \cdots \times G_n \to G$ by $\theta(g_1, g_2, \ldots, g_n) = g_1g_2 \cdots g_n$.
This is onto because $G = G_1G_2 \cdots G_n$, and it is a homomorphism because the
$G_i$ commute elementwise. (Note, this does not require that the $G_i$ are abelian.)
As $\theta$ is one-to-one (because the $G_i$ are unconnected), $\theta$ is an isomorphism, and so
$G \cong G_1 \times G_2 \times \cdots \times G_n$. Finally, if $g = a_1a_2 \cdots a_n = b_1b_2 \cdots b_n$, $a_i \in G_i$, $b_i \in G_i$,
we must show $a_i = b_i$ for all $i$. First $(b_1^{-1}a_1)a_2a_3 \cdots a_n = b_2b_3 \cdots b_n$. Since $b_2^{-1}$
commutes with $(b_1^{-1}a_1)$ by (2), we get $(b_1^{-1}a_1)(b_2^{-1}a_2)a_3 \cdots a_n = b_3 \cdots b_n$. Continue
to get $(b_1^{-1}a_1)(b_2^{-1}a_2) \cdots (b_n^{-1}a_n) = 1$. Hence, each $b_i^{-1}a_i = 1$ by Lemma 2, proving
the last sentence of the Theorem. ∎


The Correspondence Theorem 


If $\alpha: G \to H$ is an onto group homomorphism, the group $H = \alpha(G)$ is called a
homomorphic image of $G$, or simply an image of $G$. Thus, the image $\alpha(G)$ enjoys
any property of $G$ that is preserved by homomorphisms, for example, being abelian
or cyclic. The image is a simplified version of the group and so is easier to study.
The idea is to learn about the group $G$ by investigating its homomorphic images.

The isomorphism theorem (Theorem 4 §2.10) provides a fundamental tool for
investigating a homomorphic image $\alpha(G)$ of a group $G$. It asserts that $\alpha(G)$ is
isomorphic to a factor group of $G$, indeed that $\alpha(G) \cong G/K$, where $K$ denotes the
kernel of $\alpha$. On the other hand, if $K$ is any normal subgroup of $G$, the factor group
$G/K$ is a homomorphic image of $G$ via the coset homomorphism $G \to G/K$ given
by $g \mapsto Kg$. Thus, studying the images $\alpha(G)$ of $G$ is the same as studying the
factors $G/K$ of $G$. The factors of $G$ are very closely connected to $G$ itself and we
now focus on these factors and how to use them to study the group $G$.

Because many properties of G can be described in terms of the subgroups of G, 
we need to be able to obtain information about these subgroups from knowledge 
of the subgroups of a factor group G/K. The next theorem provides a method of 
doing this. It gives a very useful correspondence between the set of subgroups of G 
that contain K and the set of all subgroups of G/K. Moreover, the correspondence 
is such that, if we can determine all the subgroups in one of these sets, we can easily 
compute the subgroups in the other set. 

To show how this correspondence works let $K \lhd G$, and consider the set of
subgroups $H$ of $G$ such that $K \subseteq H$. For any such $H$, define

$$H/K = \{Kh \mid h \in H\}.$$


Lemma 3. If $K \subseteq H \subseteq G$ are groups and $K \lhd G$, then
(1) $H/K$ is a subgroup of $G/K$.
(2) If $K \subseteq H_1 \subseteq G$, then $H \subseteq H_1$ if and only if $H/K \subseteq H_1/K$.
Proof. (1) This is because $H/K$ is the image of $H$ under the coset map $G \to G/K$.
(2) Clearly $H \subseteq H_1$ implies $H/K \subseteq H_1/K$. Conversely, assume $H/K \subseteq H_1/K$,
and let $h \in H$. Then $Kh \in H/K \subseteq H_1/K$, say $Kh = Kh_1$ for some $h_1 \in H_1$. As
$K \subseteq H_1$, this gives $h \in Kh_1 \subseteq H_1h_1 \subseteq H_1$, so $H \subseteq H_1$. ∎


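Theorem 5. Correspondence Theorem. If $K \lhd G$ are groups, define a map

$$\Theta: \{H \mid K \subseteq H \subseteq G,\ H \text{ is a subgroup}\} \to \{\bar{H} \mid \bar{H} \subseteq G/K \text{ is a subgroup}\}$$

by $\Theta(H) = H/K$. Let $H$ and $H_1$ be subgroups of $G$ containing $K$. Then:

(1) The map $\Theta$ is a bijection.
(2) If $\bar{H}$ is a subgroup of $G/K$, then $\bar{H} = H/K$ where $H = \{g \in G \mid Kg \in \bar{H}\}$.
(3) $\Theta$ preserves containment: $H \subseteq H_1$ if and only if $H/K \subseteq H_1/K$.
(4) $\Theta$ preserves normality: $H \lhd H_1$ if and only if $H/K \lhd H_1/K$.
(5) $\Theta$ preserves intersections: $(H/K) \cap (H_1/K) = (H \cap H_1)/K$.
(6) $\Theta$ preserves products: $(H/K)(H_1/K)$ is a subgroup of $G/K$ if and only if $HH_1$ is a
subgroup of $G$, and in this case $(H/K)(H_1/K) = (HH_1)/K$.


Proof. (1) The map $\Theta$ is one-to-one by Lemma 3, and it is onto by (2).

(2) Given a subgroup $\bar{H} \subseteq G/K$, define $H = \{h \in G \mid Kh \in \bar{H}\}$ as in (2). Then
$H$ is a subgroup of $G$ and $K \subseteq H$. Moreover $\bar{H} \subseteq H/K$ because $Kg \in \bar{H}$ implies
that $g \in H$, whence $Kg \in H/K$. Conversely, if $Kg \in H/K$ then $Kg = Kh$ for some
$h \in H$ so, as $K \subseteq H$, $g \in Kh \subseteq HH \subseteq H$. But then $Kg \in \bar{H}$ by the definition of $H$,
and we have shown that $H/K \subseteq \bar{H}$. Hence $H/K = \bar{H}$, as required.

(3) This restates Lemma 3.

(4) Assume $H/K \lhd H_1/K$. To show $H \lhd H_1$, let $h \in H$ and $h_1 \in H_1$. Then

$$K(h_1hh_1^{-1}) = Kh_1 \cdot Kh \cdot (Kh_1)^{-1} \in Kh_1\,(H/K)\,(Kh_1)^{-1} \subseteq H/K.$$

Hence $K(h_1hh_1^{-1}) = Kh'$ for some $h' \in H$, so $h_1hh_1^{-1} \in KH \subseteq H$ because $K \subseteq H$.
As $h \in H$ was arbitrary, this shows that $H \lhd H_1$.

Conversely, let $H \lhd H_1$. Then $(Kh_1)^{-1}\,Kh\,Kh_1 = K(h_1^{-1}hh_1) \in H/K$ for all
$h_1 \in H_1$, $h \in H$, because $h_1^{-1}hh_1 \in H$. As $h \in H$ was arbitrary, this shows that
$(Kh_1)^{-1}(H/K)Kh_1 \subseteq H/K$. Hence $H/K \lhd H_1/K$, as required.

(5) and (6) These are left as Exercises 9 and 10. ∎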


If $K \lhd G$, the bijection $H \mapsto H/K$ in Theorem 5 pairs all subgroups $H \supseteq K$ of
$G$ with all subgroups $H/K$ of $G/K$. Not only is this correspondence a bijection, it
also pairs normal subgroups with normal subgroups by (4) and preserves inclusion
by (3). This last fact means that the lattice diagram of all subgroups of $G/K$
has the same form as the lattice diagram of all subgroups of $G$ that contain $K$. In
particular, the bijection pairs $G$ with $G/K$, and it pairs $K$ with $K/K = \{K\}$, the
trivial subgroup of $G/K$. This is illustrated in the following two examples.
Example 1. Let $G = \langle a \rangle$, where $o(a) = 12$, and let $K = \langle a^6 \rangle$ and $K_1 = \langle a^4 \rangle$. Draw
the lattice diagram of all subgroups of $G$, and use the correspondence theorem to
obtain the lattice of all subgroups of $G/K$ and $G/K_1$.


Solution. The subgroups of $G$ are given by the divisors of 12, and the subgroup
lattice for $G$ appears on the left in the diagram (see Example 14 §2.4). The subgroups
of $G/K$ are thus determined (using the correspondence theorem) by the subgroups
$G$, $\langle a^2 \rangle$, $\langle a^3 \rangle$, and $K$ of $G$ that contain $K$. Thus the lattice diagram for $G/K$
shown in the center diagram can be "read off" from the diagram for $G$.

[Figure: subgroup lattice of $G$ (left), with the lattices of $G/K$ (center) and $G/K_1$ (right) read off from it.]

Similarly $G$, $\langle a^2 \rangle$ and $K_1$ are the only subgroups containing $K_1$, and they give the
subgroup lattice for $G/K_1$ on the right. □
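
The bookkeeping in Example 1 can be reproduced with a short Python sketch. As an assumption made purely for illustration, we model the cyclic group of order 12 additively as $\mathbb{Z}_{12}$, so that $\langle a^d \rangle$ corresponds to the subgroup generated by $d$; the function names `subgroups_of_cyclic` and `subgroups_containing` are our own.

```python
def subgroups_of_cyclic(n):
    """Subgroups of the additive group Z_n: one subgroup <d> for each divisor d of n."""
    return {d: frozenset(range(0, n, d)) for d in range(1, n + 1) if n % d == 0}


def subgroups_containing(n, k):
    """Return the divisors d of n such that the subgroup <d> of Z_n contains <k>."""
    K = frozenset((k * i) % n for i in range(n))
    return sorted(d for d, H in subgroups_of_cyclic(n).items() if K <= H)


# K = <a^6> corresponds to <6>, and K_1 = <a^4> corresponds to <4>.
print(subgroups_containing(12, 6))   # generators d of the subgroups containing K
print(subgroups_containing(12, 4))   # generators d of the subgroups containing K_1
```

The first call returns `[1, 2, 3, 6]`, matching the subgroups $G$, $\langle a^2 \rangle$, $\langle a^3 \rangle$, and $K$; the second returns `[1, 2, 4]`, matching $G$, $\langle a^2 \rangle$, and $K_1$. The counts 4 and 3 agree with the number of subgroups of the cyclic groups $G/K$ (order 6) and $G/K_1$ (order 4).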


Example 2. Consider the octic group $G = D_4 = \{1, a, a^2, a^3, b, ba, ba^2, ba^3\}$, where
$o(a) = 4$, $o(b) = 2$ and $aba = b$. If $K = \{1, a^2\}$, determine all the subgroups $H$ of $G$
such that $K \subseteq H \subseteq G$.


Solution. By Example 4 §2.9, $K = Z(G)$, and $G/K = \{K, Ka, Kb, Kba\} \cong K_4$, the
Klein group. Hence, the subgroups of $G/K$ are $\{K\}$, $G/K$, and

$$\bar{H}_1 = \langle Ka \rangle = \{K, Ka\}; \quad \bar{H}_2 = \langle Kb \rangle = \{K, Kb\}; \quad \bar{H}_3 = \langle Kba \rangle = \{K, Kba\}.$$

The subgroup lattice diagram of $G/K$ is shown at the left in the diagram.

[Figure: subgroup lattice of $G/K$ with $\bar{H}_1$, $\bar{H}_2$, $\bar{H}_3$ (left), and the lattice of subgroups of $G$ containing $K$ with $H_1$, $H_2$, $H_3$ (right).]

The correspondence theorem ensures that, for each $i$, there is a unique subgroup
$H_i$ of $G$ such that $K \subseteq H_i$ and $\bar{H}_i = H_i/K$. Explicitly, (2) of Theorem 5 gives

$H_1 = \{g \in G \mid Kg \in \bar{H}_1\} = \{1, a, a^2, a^3\}$,
$H_2 = \{g \in G \mid Kg \in \bar{H}_2\} = \{1, b, a^2, ba^2\}$,
$H_3 = \{g \in G \mid Kg \in \bar{H}_3\} = \{1, ba, a^2, ba^3\}$.

The correspondence theorem also shows that these are the only subgroups $H$ such
that $K \subset H \subset G$, and that the lattice of such subgroups (in the right diagram) has
the same form as the subgroup lattice of $G/K$. Furthermore, the fact that $\bar{H}_1$, $\bar{H}_2$,
and $\bar{H}_3$ are normal in (the abelian group) $G/K$ guarantees that $H_1$, $H_2$, and $H_3$ are
normal in $G$. (Of course this also follows because they are of index 2 in $G$.) □
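
The conclusion of Example 2 can be confirmed by brute force. The following Python sketch is an illustration under our own modelling assumptions: $D_4$ is realized as permutations of the vertices $0, 1, 2, 3$ of a square, with `a` the rotation $i \mapsto i + 1$ and `b` the reflection $i \mapsto -i$ (mod 4), which satisfy $aba = b$. It lists every subset of $G$ that contains $K$ and is closed under multiplication; a nonempty finite closed subset is a subgroup.

```python
from itertools import combinations


def compose(s, t):
    """Return the permutation 'apply t first, then s'."""
    return tuple(s[t[i]] for i in range(len(t)))


def generate(gens, identity):
    """Return the group generated by gens (closure under composition)."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def is_closed(S):
    """Check that the subset S is closed under composition."""
    return all(compose(x, y) in S for x in S for y in S)


identity = (0, 1, 2, 3)
a = (1, 2, 3, 0)           # rotation, o(a) = 4
b = (0, 3, 2, 1)           # reflection, o(b) = 2
G = generate([a, b], identity)
K = {identity, compose(a, a)}   # K = {1, a^2}

others = sorted(G - K)
subgroups = []
for r in range(len(others) + 1):
    for extra in combinations(others, r):
        S = K | set(extra)
        if is_closed(S):
            subgroups.append(S)

orders = sorted(len(S) for S in subgroups)
print(orders)
```

The orders found are 2, 4, 4, 4, and 8, that is, $K$, the three subgroups $H_1$, $H_2$, $H_3$, and $G$ itself, in agreement with the five subgroups of the Klein group $G/K$.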


An important special case of the correspondence theorem describes when a factor
group is simple. If $K \lhd G$, the group $G/K$ is simple if and only if the only normal
subgroups are the trivial subgroup $K/K = \{K\}$ and the whole group $G/K$. Hence,
the correspondence theorem shows that this holds if and only if the only normal subgroups $H$ such that
$K \subseteq H \subseteq G$ are $H = K$ and $H = G$. A normal subgroup $K \ne G$ with this
latter property is called a maximal normal subgroup of $G$. This discussion is
summarized in

Theorem 6. A normal subgroup $K \lhd G$ is a maximal normal subgroup if and only
if $G/K$ is simple.


Every finite group $G \ne \{1\}$ has maximal normal subgroups: choose any proper
normal subgroup (possibly $\{1\}$) of maximal order. Hence, $G$ has finite simple factor
groups by Theorem 6, which shows that finite simple groups are quite common. In
fact they serve as "building blocks" by which we can study the structure of finite
groups in general. We return to this topic in Chapter 9.

The correspondence theorem describes the subgroups of a factor group $G/K$,
where $K \lhd G$. We now turn to the images of $G/K$. Of course they are all images
of $G$, and so have the form $G/H$ for some $H \lhd G$. They are described next by an
important consequence of the isomorphism theorem that will be needed later.


Theorem 7. Third Isomorphism Theorem. Let $K \subseteq H \subseteq G$ be groups, where
$K \lhd G$ and $H \lhd G$. Then $H/K \lhd G/K$ and

$$\frac{G/K}{H/K} \cong G/H.$$

Proof. Define $\alpha: G/K \to G/H$ by $\alpha(Kg) = Hg$ for all $g$ in $G$. This is well defined
because $Kg = Kg_1$ implies $gg_1^{-1} \in K \subseteq H$, whence $Hg = Hg_1$. With this it is easy
to verify that $\alpha$ is an onto homomorphism, and $\ker(\alpha) = \{Kg \mid Hg = H\} = H/K$.
Hence, the isomorphism theorem (Theorem 4 §2.10) completes the proof. ∎


Our final example provides a good illustration of how the second and third
isomorphism theorems are used. A group $G$ is called a metacyclic group if a
normal subgroup $K \lhd G$ exists such that both $K$ and $G/K$ are cyclic. Every cyclic
group is metacyclic (take $K = \{1\}$) as is $D_n$ (take $K$ to be the cyclic subgroup of
index 2). Thus, $D_n$ is metacyclic but not cyclic.


Example 3. Show that every subgroup and every image of a metacyclic group is
again metacyclic.


Solution. Let $G$ be metacyclic, say $K$ and $G/K$ are both cyclic where $K \lhd G$.

If $H$ is a subgroup of $G$ then $H \cap K \lhd H$, and $H \cap K$ is cyclic (being a subgroup
of the cyclic group $K$). On the other hand, $H/(H \cap K) \cong HK/K$ by the second
isomorphism theorem, and $HK/K$ is cyclic (it is a subgroup of $G/K$). It follows
that $H/(H \cap K)$ is cyclic, whence $H$ is metacyclic.

Now let $G/N$ be any image of $G$, where $N \lhd G$. Then $NK \lhd G$ by Theorem
1, so $NK/N \lhd G/N$. Moreover, $NK/N$ is cyclic because $NK/N \cong K/(N \cap K)$
is an image of the cyclic group $K$. On the other hand, $(G/N)/(NK/N) \cong G/NK$
is cyclic because $G/NK \cong (G/K)/(NK/K)$ is an image of the cyclic group $G/K$.
This means that $G/N$ is metacyclic. □


Exercises 8.1 


1. Let $S_3 = \{\varepsilon, \sigma, \sigma^2, \tau, \tau\sigma, \tau\sigma^2\}$ where $\sigma^3 = \varepsilon = \tau^2$ and $\sigma\tau\sigma = \tau$. Compute $XY$ if:
(a) X = {r, ro} and Y = {1,707}. 
(b) X = {o,ro} and Y = {0,07}. 

2. If a: G — Cg is an onto group homomorphism and |ker(a)| = 3, show that |G| = 18 
and G has normal subgroups of orders 3, 6 and 9. 

3. Use the correspondence theorem to show that each subgroup H of G with G’ C H 
is normal in G. (See Theorem 3 §2.9.) 

4. In each case use Theorem 5 to find all subgroups of G that contain K. 

(a) G= Dg and K = Z(De). 
(b) G=Q and K = Z(Q). 
(c) $G = A_4$ and $K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$.

5. In each case describe all maximal normal subgroups of G. 
(a) $G = \mathbb{Z}$ (b) $G$ is cyclic, $|G| = n$
(c) G@= Dyo (d)G=Q 

6. Let $K \lhd G$ be such that both $K$ and $G/K$ are simple. Show that either $K$ is the
only proper normal subgroup of $G$, or $G \cong K \times (G/K)$.

7. Let $K \lhd G$ and assume that $G/K$ is cyclic, $|K| = k$, and $|G| = n$. If $m$ is an
integer such that $k \mid m$ and $m \mid n$, show that there is a unique subgroup $H$ such that
$K \subseteq H \subseteq G$ and $|H| = m$. [Hint: Theorem 9 §2.4.]

8. If $K \lhd G$, show that the following conditions are equivalent.

(1) The only subgroups $H$ such that $K \subseteq H \subseteq G$ are $H = K$ and $H = G$.
(2) $G/K$ is cyclic and of prime order.
9. Show that the correspondence theorem preserves intersections. More precisely, if
$K \subseteq H, H_1 \subseteq G$, $H$, $H_1$ subgroups, $K \lhd G$, show that $(H/K) \cap (H_1/K) = (H \cap H_1)/K$.
10. Show that the correspondence theorem preserves products. More precisely, if we
have $K \subseteq H \subseteq G$ and $K \subseteq H_1 \subseteq G$ where $K \lhd G$, $H$, $H_1$ subgroups, show that
(a) $(H/K) \cdot (H_1/K)$ is a subgroup of $G/K$ if and only if $HH_1$ is a subgroup of $G$.
(b) In this case $(H/K) \cdot (H_1/K) = (HH_1)/K$.
11. If $X$ and $Y$ are nonempty subsets of some group, show that $\langle X \rangle \langle Y \rangle \subseteq \langle X \cup Y \rangle$,
with equality if and only if $\langle X \rangle \langle Y \rangle = \langle Y \rangle \langle X \rangle$. [Hint: Lemma 2 §2.8.]


12. (a) If $H$ is a subgroup of a group $G$, show that $H^2 = H$.
(b) If $X \ne \emptyset$ is a finite subset of $G$, show that $X$ is a subgroup if and only if
$X^2 \subseteq X$.
13. Let $G$ be a group with $|G| = pqr$, where $p$, $q$ and $r$ are distinct primes. If $H$ and
$K$ are subgroups, $|H| = pq$, and $|K| = qr$, show that $|H \cap K| = q$. [Hint: Lagrange's
theorem.]
14. Let $G = \langle g \rangle$ be a cyclic group, and let $A = \langle g^a \rangle$ and $B = \langle g^b \rangle$. Show that $AB = \langle g^d \rangle$
where $d = \gcd(a, b)$.
15. Let $K$, $A$ and $B$ be subgroups of $G$ with $K \lhd G$ and $A \lhd B$. Show that $KA \lhd KB$.
16. Let $H$, $K$ and $M$ be subgroups of a group $G$, and assume that $H \subseteq M$. If both
$H \cap K = M \cap K$ and $HK = MK$ hold, show that $H = M$. [Hint: First show that
$M = (HK) \cap M$.]
17. Let $|G| = p^nm$ where $p$ is a prime and $p$ does not divide $m$. If $K \lhd G$ satisfies
$|K| = p^n$, show that $K$ is the only subgroup of $G$ of order $p^n$. [Hint: Theorem 1.]
18. If $G$ is a group, $M \lhd G$ is maximal normal, $K \lhd G$, and $K \not\subseteq M$, show that
(a) $G = KM$.
(b) $K/(K \cap M)$ is simple.
(c) $G/(K \cap M)$ has a simple direct factor.
19. A group $G$ is called a metabelian group if $K \lhd G$ exists such that both $K$ and
$G/K$ are abelian.
(a) Show that every subgroup and factor group of a metabelian group is metabelian.
(b) Show that $G$ is metabelian if and only if the commutator subgroup $G'$ is abelian.
20. Let $\mathcal{C}$ be a nonempty family of groups closed under taking subgroups and images
(for example the abelian or the cyclic groups). Call a group $G$ meta-$\mathcal{C}$ if there exists
$K \lhd G$ such that both $K$ and $G/K$ are in $\mathcal{C}$. Note that every group in $\mathcal{C}$ is meta-$\mathcal{C}$
because the trivial group $\{1\}$ is in $\mathcal{C}$. If $G$ is meta-$\mathcal{C}$ show that every subgroup and
factor group of $G$ is meta-$\mathcal{C}$.
21. Let $G$ be a group with subgroups $H$ and $K$. Assume that $|H| = pq$ and $|K| = q^2$,
where $p \ne q$ are primes. If $|G| < pq^3$, show that $|H \cap K| = q$.
22. Let $G$ be a finite abelian group.
(a) If $G$ has two distinct elements of order 2, show that 4 divides $|G|$.
(b) If $G$ has three distinct elements of order 3, show that 9 divides $|G|$.
23. If $G$ is a group, let $M$ denote the monoid of nonempty subsets of $G$, and identify
$G \subseteq M$ by writing $g = \{g\}$ for each $g \in G$. Show that $G$ is the group of units of $M$.
24. Let $G$ be a group, let $S_G$ be the group of permutations of $G$, and write $A = \operatorname{aut}(G)$.
If $\tau_a: G \to G$ is defined by $\tau_a(g) = ag$ for all $g \in G$, let $\bar{G} = \{\tau_a \mid a \in G\}$ be the
group of translations. Thus $\bar{G} \cong G$ by Cayley's theorem (Theorem 6 §2.5).
(a) Show that $\bar{G}A$ is a subgroup of $S_G$ called the holomorph of $G$.
(b) Show that $\bar{G} \cap A = \{1_G\}$.
(c) Show that $\bar{G} \lhd \bar{G}A$.
(d) Show that $\bar{G}A/\bar{G} \cong A$.


8.2 CAUCHY’S THEOREM 


If $p$ is a prime and $G$ is a group of order $p^n$, every element of $G$ has order a power
of $p$ by Lagrange’s theorem. The converse is also true. If every element of a finite
group $G$ has $p$-power order, then $|G| = p^n$ for some $n \geq 0$. The proof of this result
requires several theorems that are important in themselves and reveal many other
properties of groups of $p$-power order.

Recall that two subgroups $H$ and $K$ of a group $G$ are called conjugate in
$G$ if $K = gHg^{-1}$ for some $g \in G$. This relation is an equivalence on the set of all
subgroups of $G$, and the analogous equivalence on the elements of $G$ is an important
tool in this section. Thus, two elements $a$ and $b$ of a group $G$ are said to be
conjugate in $G$ if $b = gag^{-1}$ for some $g \in G$. This is an equivalence on $G$ and the
equivalence class of $a \in G$ is denoted

$$\operatorname{class} a = \{x \in G \mid x \text{ is conjugate to } a\} = \{gag^{-1} \mid g \in G\},$$

and is called the conjugacy class of $a$.

Hence the conjugacy classes partition a group $G$. Clearly, $\operatorname{class} 1 = \{1\}$ in any
group and, more generally, $\operatorname{class} a = \{a\}$ if and only if $a \in Z(G)$. Also, if $a$ and
$b$ are conjugate, then $o(a) = o(b)$ because $gag^{-1}$ is the image of $a$ under an inner
automorphism of $G$. Hence, all elements in a conjugacy class have the same order.


Example 1. Partition $D_3$ into conjugacy classes.


Solution. Let $D_3 = \{1, a, a^2, b, ba, ba^2\}$, where $o(a) = 3$, $o(b) = 2$, and $aba = b$.
We have $\operatorname{class} 1 = \{1\}$. As $a$ and $a^2$ are the only elements of order 3, we have
$\operatorname{class} a \subseteq \{a, a^2\}$. But $a^2 = bab^{-1}$, so $\operatorname{class} a = \{a, a^2\}$. Similarly, $\operatorname{class} b = \{b, ba, ba^2\}$
because $aba^{-1} = ba$ and $a^2ba^{-2} = ba^2$.
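
The partition in Example 1 can be checked by brute force. The following is a minimal sketch, assuming $D_3$ is modelled as the group of all permutations of $\{0, 1, 2\}$ (the two groups are isomorphic); the helper names `compose`, `inverse`, and `conjugacy_classes` are illustrative and are not taken from the text.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition a finite group of permutation tuples into conjugacy classes."""
    remaining = set(group)
    classes = []
    while remaining:
        a = min(remaining)  # any representative works; min keeps the order stable
        cls = {compose(compose(g, a), inverse(g)) for g in group}
        classes.append(cls)
        remaining -= cls
    return classes

# Assumption: D3 is modelled as the symmetric group on {0, 1, 2}.
D3 = list(permutations(range(3)))
classes = conjugacy_classes(D3)
class_sizes = sorted(len(c) for c in classes)
print(class_sizes)
```

The three class sizes correspond to $\{1\}$, $\{a, a^2\}$, and $\{b, ba, ba^2\}$.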


It can be shown (Exercise 15) that two permutations in $S_n$ are conjugate if and
only if they have the same cycle structure; that is, when factored into disjoint
cycles they have the same number of cycles of each length.


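Example 2. The conjugacy classes of $S_4$ are

$$\begin{aligned}
\operatorname{class} \varepsilon &= \{\varepsilon\} \\
\operatorname{class} (1\ 2) &= \{(1\ 2), (1\ 3), (1\ 4), (2\ 3), (2\ 4), (3\ 4)\} \\
\operatorname{class} (1\ 2)(3\ 4) &= \{(1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\} \\
\operatorname{class} (1\ 2\ 3) &= \{(1\ 2\ 3), (1\ 3\ 2), (1\ 2\ 4), (1\ 4\ 2), (1\ 3\ 4), (1\ 4\ 3), (2\ 3\ 4), (2\ 4\ 3)\} \\
\operatorname{class} (1\ 2\ 3\ 4) &= \{(1\ 2\ 3\ 4), (1\ 2\ 4\ 3), (1\ 3\ 2\ 4), (1\ 3\ 4\ 2), (1\ 4\ 2\ 3), (1\ 4\ 3\ 2)\}
\end{aligned}$$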


If $K$ is a normal subgroup of $G$, then $gKg^{-1} = K$ for all $g \in G$, and so $K$
contains the conjugacy class of each of its elements. Conversely, any subgroup that
is a union of conjugacy classes must be normal (Exercise 5). This proves


Theorem 1. If $H$ is a subgroup of a group $G$, then $H \triangleleft G$ if and only if $H$ is a
union of conjugacy classes.


If $D_3 = \{1, a, a^2, b, ba, ba^2\}$, as in Example 1, Theorem 1 shows that any
normal subgroup $K$ of $D_3$ must be a union of the conjugacy classes $\{1\}$, $\{a, a^2\}$, and
$\{b, ba, ba^2\}$. Because $1 \in K$ and $|K|$ divides $|D_3| = 6$, the only normal subgroups of
$D_3$ are $\{1\}$, $\{1, a, a^2\}$, and $D_3$. Similarly, Example 2 gives Example 3 (Exercise 17).


Example 3. The normal subgroups of $S_4$ are $\{\varepsilon\}$, $A_4$, $S_4$, and

$$K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}.$$
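
The reasoning behind Example 3 (a normal subgroup is a union of conjugacy classes that contains the identity and whose size divides the group order) can be turned into a small search. This is a sketch only, assuming $S_4$ is represented by permutation tuples of $\{0, 1, 2, 3\}$; all function names are illustrative.

```python
from itertools import combinations, permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition a finite group of permutation tuples into conjugacy classes."""
    remaining = set(group)
    classes = []
    while remaining:
        a = min(remaining)
        cls = {compose(compose(g, a), inverse(g)) for g in group}
        classes.append(cls)
        remaining -= cls
    return classes

def is_closed(subset):
    """A finite nonempty subset closed under products is a subgroup."""
    return all(compose(x, y) in subset for x in subset for y in subset)

S4 = list(permutations(range(4)))
identity = tuple(range(4))
others = [c for c in conjugacy_classes(S4) if identity not in c]

# Try every union of classes that contains the identity class.
normal_orders = []
for r in range(len(others) + 1):
    for combo in combinations(others, r):
        candidate = {identity}.union(*combo)
        if len(S4) % len(candidate) == 0 and is_closed(candidate):
            normal_orders.append(len(candidate))
normal_orders.sort()
print(normal_orders)
```

The orders found correspond to $\{\varepsilon\}$, $K$, $A_4$, and $S_4$.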




The relationship between conjugacy classes and normality is even closer than
that shown in Theorem 1. If $X \subseteq G$ is a nonempty subset, write

$$N(X) = \{g \in G \mid gXg^{-1} = X\}.$$

This is a subgroup of $G$ for every $X$ (Exercise 12), called the normalizer of $X$ in
$G$. We write $N(X) = N_G(X)$ if the group $G$ must be emphasized, and we abbreviate
$N(\{a\}) = N(a)$ for $a \in G$. Note that $N(a) = \{g \in G \mid ga = ag\}$. For this reason,
$N(a)$ is often called the centralizer of $a$ in $G$. The normalizer of a subgroup has
the following properties, which explain the name.


Lemma 1. Let $H$ be a subgroup of a group $G$.

(1) $H \triangleleft N(H)$.

(2) If $H \triangleleft K$, where $K$ is a subgroup of $G$, then $K \subseteq N(H)$.


Proof. Let $H \triangleleft K$ and $k \in K$. Then $kHk^{-1} = H$, so $k \in N(H)$. Thus, $K \subseteq N(H)$,
proving (2). If we take $K = H$ in (2), we get $H \subseteq N(H)$, whence $H \triangleleft N(H)$. ∎


We can summarize Lemma 1 by saying that $N(H)$ is the largest subgroup of $G$ in
which $H$ is normal. In particular, $H \triangleleft G$ if and only if $N(H) = G$. At the other
extreme, it can happen that $N(H) = H$ (consider $H = \{\varepsilon, (1\ 2)\}$ in $S_3$).

Much of the importance of normalizers stems from their connection with
conjugation. Recall that $|G : H|$ denotes the index in $G$ of a subgroup $H \subseteq G$.


Theorem 2. Let $G$ be a finite group.
(1) $|\operatorname{class} a| = |G : N(a)|$ for each $a \in G$.
(2) The number of conjugates of a subgroup $H$ of $G$ is $|G : N(H)|$.


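Proof. We prove (1); (2) is analogous (Exercise 13). Write $N(a) = N$. The index
is $|G : N| = |\{gN \mid g \in G\}|$. Since $\operatorname{class} a = \{gag^{-1} \mid g \in G\}$, define a mapping

$$\varphi : \operatorname{class} a \to \{gN \mid g \in G\} \quad\text{by}\quad \varphi(gag^{-1}) = gN.$$

It suffices to prove that $\varphi$ is a bijection. Now $N = \{x \in G \mid ax = xa\}$, so we have

$$\begin{aligned}
gag^{-1} = hah^{-1} &\iff (h^{-1}g)a = a(h^{-1}g) \\
&\iff h^{-1}g \in N \\
&\iff gN = hN.
\end{aligned}$$

This shows that $\varphi$ is well defined and one-to-one; as $\varphi$ is onto, this proves (1). ∎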
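
Part (1) of Theorem 2 is easy to test numerically. The sketch below checks $|\operatorname{class} a| = |G : N(a)|$ for every $a$ in $S_4$, assuming permutations are stored as tuples on $\{0, 1, 2, 3\}$ and that the index is computed as $|G|/|N(a)|$ by Lagrange’s theorem; the function names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

S4 = list(permutations(range(4)))

def conjugacy_class(a):
    """All conjugates g a g^{-1} of a in S4."""
    return {compose(compose(g, a), inverse(g)) for g in S4}

def centralizer(a):
    """N(a): the elements of S4 that commute with a."""
    return [g for g in S4 if compose(g, a) == compose(a, g)]

# |class a| should equal the index |G : N(a)| = |G| / |N(a)| for every a.
index_matches = all(
    len(conjugacy_class(a)) == len(S4) // len(centralizer(a)) for a in S4
)

four_cycle = (1, 2, 3, 0)  # the cycle 0 -> 1 -> 2 -> 3 -> 0
print(index_matches, len(conjugacy_class(four_cycle)), len(centralizer(four_cycle)))
```

For the 4-cycle chosen here the class has 6 elements and the centralizer has 4, in agreement with Example 2.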


Combining Theorem 2 with the fact that $\operatorname{class} a = \{gag^{-1} \mid g \in G\}$ gives

$$a \in Z(G) \iff \operatorname{class} a = \{a\} \iff N(a) = G.$$

In particular, the center $Z(G)$ is the union of all the singleton conjugacy classes.
This leads to the following useful theorem.


Theorem 3. The Class Equation. Let $G$ be a finite group and let $\operatorname{class} a_1$,
$\operatorname{class} a_2, \ldots, \operatorname{class} a_n$ be the nonsingleton conjugacy classes. Then

$$|G| = |Z(G)| + \sum_{i=1}^{n} |G : N(a_i)|.$$




Proof. The conjugacy classes partition $G$, so $|G|$ is the sum of the sizes of these
classes. But the number of elements in $\operatorname{class} a_i$ is $|G : N(a_i)|$ by Theorem 2, and
$|Z(G)|$ is the number of singleton classes. The class equation follows. ∎


Example 4. Consider the quaternion group $Q = \{1, -1, i, -i, j, -j, k, -k\}$ as in
Example 9 §2.8. The conjugacy classes are $\{1\}$, $\{-1\}$, $\{i, -i\}$, $\{j, -j\}$, and $\{k, -k\}$.
We have $N(i) = \{1, -1, i, -i\}$ so that $|Q : N(i)| = 2 = |\operatorname{class} i|$, as in Theorem 2.
Because $Z(Q) = \{1, -1\}$, the class equation is apparent.
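
The class equation for $Q$ can be verified directly. The following sketch assumes the elements of $Q$ are encoded as integer 4-tuples $(w, x, y, z)$ standing for $w + xi + yj + zk$ and multiplied by the usual Hamilton product; this encoding and the names `qmul`, `qinv` are choices made for the illustration.

```python
def qmul(x, y):
    """Hamilton product of quaternions given as (w, i, j, k) integer tuples."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qinv(x):
    """The inverse of a unit quaternion is its conjugate."""
    w, i, j, k = x
    return (w, -i, -j, -k)

units = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
Q = units + [tuple(-c for c in u) for u in units]  # 1, i, j, k and their negatives

def conjugacy_class(a):
    """All conjugates g a g^{-1} of a in Q."""
    return frozenset(qmul(qmul(g, a), qinv(g)) for g in Q)

classes = {conjugacy_class(a) for a in Q}
center = [a for a in Q if len(conjugacy_class(a)) == 1]
nonsingleton_sizes = sorted(len(c) for c in classes if len(c) > 1)

# Class equation: |Q| = |Z(Q)| + sum of the nonsingleton class sizes.
print(len(Q), len(center), nonsingleton_sizes)
```

Here the equation reads $8 = 2 + 2 + 2 + 2$.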


The class equation is reminiscent of Lagrange’s theorem in that it provides
arithmetic information about the group. That Lagrange’s theorem is useful is beyond
doubt; the usefulness of the class equation lies in the fact that each term $|G : N(a)|$
is a divisor of $|G|$ which is not equal to 1 when $a \notin Z(G)$. This fact is particularly
useful when $|G|$ is a prime power, as we shall see.

However, before doing so we use the class equation to prove an important
theorem about general finite groups, due to A. L. Cauchy. If $G$ is a finite group, the
order of each element divides $|G|$ by Lagrange’s theorem. The converse is false. For
example, $|A_4| = 12$ but $A_4$ has no element of order 6. However, a partial converse
does hold.


Theorem 4. Cauchy’s Theorem. If a prime p divides the order of a finite group 
G, then G has an element of order p. 


Proof. If $G$ is abelian, a (self-contained) proof has already been given (Lemma 1
§7.2). In general, we use induction on $|G|$. The theorem is easily verified if $|G| \leq 3$.
If $|G| > 3$, let $\operatorname{class} a_1, \ldots, \operatorname{class} a_n$ denote the nonsingleton conjugacy classes so
that $|N(a_i)| < |G|$. If $p$ divides $|N(a_i)|$ for any $i$, the proof is complete by induction.
Otherwise, $p$ divides $|G : N(a_i)|$ for each $i$, and hence $p$ divides $|Z(G)|$ by the class
equation. As $Z(G)$ is abelian, Lemma 1 §7.2 completes the proof. ∎


As with many important theorems, the method of proof of Cauchy’s theorem
is at least as important as the result itself. In Section 8.3, we present a sweeping
generalization of the class equation, which yields a wealth of information about
finite groups.


p-Groups 


We use Cauchy’s theorem frequently below. One of the most important applications 
is to characterize groups of prime power order. If p is a prime, a group G is called 
a p-group if the order of every element of G is a power of p. 


Lemma 2. If G is a finite group and p is a prime, then |G| is a power of p if and 
only if G is a p-group. 


Proof. Assume that $o(g)$ is a power of $p$ for all $g \in G$. If $|G|$ is not a power of $p$, let
$q$ divide $|G|$, where $q \neq p$ is a prime. Then Cauchy’s theorem shows that $G$ has an
element of order $q$, contrary to hypothesis. Hence, $|G|$ is a power of $p$. The converse
follows by Lagrange’s theorem. ∎


Thus, Lemma 2 characterizes the finite p-groups. The next result holds for all 
p-groups, finite or not, and we leave the routine proof as Exercise 21. 




Theorem 5. Let $K \subseteq G$ be groups with $K \triangleleft G$ and let $p$ be a prime. Then $G$ is
a $p$-group if and only if both $K$ and $G/K$ are $p$-groups.


Although infinite p-groups exist (Exercise 23), we focus on the finite case. The- 
orem 6 is fundamental, and the proof provides a good illustration of how to use the 
class equation. 


Theorem 6. If $p$ is a prime and $G \neq \{1\}$ is a finite $p$-group, then $Z(G) \neq \{1\}$.


Proof. Let $\operatorname{class} a_1, \ldots, \operatorname{class} a_n$ denote the nonsingleton conjugacy classes in $G$.
Because $N(a_i) \neq G$ for each $i$ by Theorem 2, and because $|G : N(a_i)|$ divides $|G|$
for each $i$, it follows that $p$ divides $|G : N(a_i)|$ for each $i$. But then $p$ divides $|Z(G)|$
by the class equation; in particular $Z(G) \neq \{1\}$. ∎
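
Theorem 6 can be illustrated on a small 2-group. The sketch below builds the dihedral group $D_4$ of order $8$ as permutations of the vertices $0, 1, 2, 3$ of a square (an assumed model, with a rotation `r` and a reflection `s` chosen for the illustration) and computes its center.

```python
def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(generators):
    """Return the finite group generated by the given permutation tuples."""
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(g, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

r = (1, 2, 3, 0)  # rotation of the square: i -> i + 1 (mod 4)
s = (0, 3, 2, 1)  # reflection of the square: i -> -i (mod 4)
D4 = generate([r, s])

# The center consists of the elements commuting with everything.
center = {z for z in D4 if all(compose(z, g) == compose(g, z) for g in D4)}
print(len(D4), sorted(center))
```

The center found is $\{1, r^2\}$, which is nontrivial and has order divisible by $p = 2$, as the proof predicts.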


Theorem 6 is very useful in the study of $p$-groups where $p$ is a prime. We give
two applications; the first characterizes all groups of order $p^2$.


Theorem 7. If $G$ is a group and $|G| = p^2$ where $p$ is a prime, then $G$ is abelian
and either $G \cong C_{p^2}$ or $G \cong C_p \times C_p$.


Proof. To prove that $G$ is abelian, we show that $Z(G) = G$. As $Z(G) \neq \{1\}$ by
Theorem 6, it suffices to show that $|Z(G)| = p$ is impossible. But, if it holds, then
$G/Z(G)$ is cyclic (it has order $p$), which implies that $G$ is abelian by Theorem 2 §2.9,
a contradiction. Hence $|Z(G)| \neq p$, so $Z(G) = G$ and $G$ is abelian. Now assume
that $G$ is not cyclic so that every element $g$ satisfies $g^p = 1$. Choose $a \neq 1$ in $G$ and
write $H = \langle a \rangle$. Then choose $b \notin H$ and write $K = \langle b \rangle$. Because $|K| = p = |H|$, we
have $H \cap K = \{1\}$, so $HK \cong H \times K$ by Theorem 3 §8.1. Hence $|HK| = p^2 = |G|$,
so $G = HK \cong H \times K \cong C_p \times C_p$. ∎


The extension of Theorem 7 to groups of order $p^3$ is false: If $p = 2$, the nonabelian
groups $D_4$ and $Q$ both have order $2^3$. More generally, if $p$ is an odd prime, Exercises
31 and 32 give nonabelian groups $G_1$ and $G_2$ of order $p^3$ such that $g^p = 1$ for all
$g \in G_1$, and $G_2$ contains an element of order $p^2$.

The next result shows that, although a finite p-group need not be abelian, it 
has an abundance of normal subgroups; in fact, it has one of every possible order. 
The proof again depends on Theorem 6 and provides a tour de force through the 
methods we have developed for dealing with finite groups. 


Theorem 8. Let $G$ be a finite $p$-group of order $p^n$. Then there exists a series

$$G = G_0 \supset G_1 \supset \cdots \supset G_n = \{1\}$$

of subgroups of $G$ such that $G_i \triangleleft G$, $|G_i| = p^{n-i}$, and $|G_i/G_{i+1}| = p$ for all $i$.


Proof. The existence of such a series is obvious if $n = 1$, so we proceed by induction
on $n$. If $|G| = p^{n+1}$, we have $Z(G) \neq \{1\}$ by Theorem 6. By Cauchy’s theorem,
choose $a \in Z(G)$ such that $o(a) = p$, and write $G_n = \langle a \rangle$. Then $G_n \triangleleft G$ and
$G/G_n$ has order $p^n$ so, by induction, let $(G/G_n) = X_0 \supset X_1 \supset \cdots \supset X_n = \{G_n\}$
be a series of subgroups of $G/G_n$ such that $X_i \triangleleft G/G_n$ and $|X_i/X_{i+1}| = p$ for each
$i$. The correspondence theorem (Theorem 5 §8.1) ensures that each $X_i$ has the form
$X_i = G_i/G_n$, where $G_i \triangleleft G$ and $|G_i/G_n| = p^{n-i}$. Furthermore, $X_i \supset X_{i+1}$
implies that $G_i \supset G_{i+1}$, and $G_i/G_{i+1} \cong X_i/X_{i+1}$ by the third isomorphism theorem
(Theorem 7 §8.1). Hence, $G \supset G_1 \supset \cdots \supset G_n \supset \{1\}$ is the required series for $G$. ∎




Note that Theorem 8 shows that if $G$ is a $p$-group and $|G| = p^n$, $n \geq 1$, then every
subgroup $H \subset G$ is contained in a subgroup $M$ with $|M| = p^{n-1}$. Such subgroups
must be normal as we shall see in the Corollary to Theorem 1 §8.3, so $G/M \cong C_p$
and $M$ is maximal.

The existence of a series of subgroups such as that in Theorem 8 gives important 
information about the group. Such series are studied in Chapter 9. 


Augustin Louis Cauchy (1789-1857) Cauchy was certainly one of the great math- 
ematicians, and it is said that he and his contemporary Gauss were the last to know 
all the mathematics of their time. But, unlike Gauss, Cauchy published profusely (sur- 
passed only by Euler and Cayley), and produced 789 papers on topics as diverse as 
optics, elasticity, differential equations, mechanics, determinants, permutation groups, 
and probability. He was the effective founder of the theory of functions of a complex 
variable. In addition he wrote three classic textbooks on analysis in which he firmly 
established standards of rigor that are now accepted by all analysts and carry down to 
today’s calculus texts. We owe our modern notions of limit and continuity to him. In 
algebra, Cauchy is remembered as the first to formulate earlier work with permutations 
in an abstract way and so to create a formal theory of groups of permutations. This 
work led Cayley (in 1854) to the modern notion of an abstract group. 


Cauchy was born in Paris and, after a stellar career in school, enrolled as an engineer 
in Napoleon's army. He continued his mathematical research and, at the age of 26, 
became a professor at the Ecole Polytechnique. He soon established himself as the 
leading mathematician in France. He also enjoyed teaching, and this pedagogical bent 
probably accounts for the influence his books had. 


Exercises 8.2 


1. In each case partition G into conjugacy classes and find all the normal subgroups. 
(a) $G = D_4$ (b) $G = Q$
2. Partition $D_n$ into conjugacy classes where $n$ is odd. [Hint: All elements of order 2
are conjugate.]
3. Suppose that $|a| = n$ in a finite group $G$. If $a^m$ is conjugate to $a$ in $G$, show that
$\gcd(m, n) = 1$. [Hint: If $gag^{-1} = a^m$, show first that $g^2ag^{-2} = a^{m^2}$.]
4. Show that $ab$ and $ba$ are conjugate in any group.
5. If a subgroup $H$ of $G$ is a union of conjugacy classes in $G$, show that $H \triangleleft G$.
6. If $H$ is a subgroup of prime index in a finite group $G$, show that either $H \triangleleft G$ or
$N(H) = H$.
7. If $H$ and $K$ are conjugate subgroups in $G$, show that $N(H)$ and $N(K)$ are conjugate.
8. If $G$ is a group, let $K = \langle \operatorname{class} a \rangle$ where $a \in G$. Show that $K \triangleleft G$.
9. If a finite group $G$ has an element with exactly two conjugates, show that $G$ is not
simple.
10. If $G$ is a finite group and $H \neq G$ is a subgroup, show that $G \neq \bigcup_{a \in G} aHa^{-1}$. [Hint:
Theorem 2.]
11. If $H$ is a subgroup of $G$ of finite index, show that $H$ has only finitely many conjugates
in $G$. [Hint: Exercise 31 §2.6.]
12. Show that $N(X)$ is a subgroup of $G$ for each nonempty subset $X$ of $G$.
13. Prove (2) of Theorem 2.
14. Let $D_3 = \{1, a, a^2, b, ba, ba^2\}$, where $o(a) = 3$, $o(b) = 2$, and $aba = b$. If $H = \{1, b\}$,
show that $N(H) = H$.


15. Use Lemma 3 §2.8 to show that two permutations are conjugate in $S_n$ if and only if
they have the same cycle structure.

16. If $\gamma = (1\ 2\ 3\ 4)$ and $\delta = (1\ 2\ 3)$ in $S_4$, compute $N(\gamma)$ and $N(\delta)$. [Hint:
Preceding exercise.]

17. Write $K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$.

(a) Show that the only normal subgroups of $S_4$ are $\{\varepsilon\}$, $K$, $A_4$, and $S_4$. [Hint:
Exercise 15.]

(b) Show that the only normal subgroups of $A_4$ are $\{\varepsilon\}$, $K$, and $A_4$. [Hint: Exercise
7 §2.8 and Lemma 2 §2.8.]

18. If $n \geq 5$, show that $\{\varepsilon\}$, $A_n$, and $S_n$ are the only normal subgroups of $S_n$. [Hint:
Theorem 8 §2.8 and Exercise 7 §2.8.]

19. If $G$ is a finite group with exactly two conjugacy classes, show that $|G| = 2$.

20. If $G$ is a group and $a \in G$, define $M(a) = \{g \in G \mid [g, a] \in Z(G)\}$, where we write
$[g, a] = gag^{-1}a^{-1}$ for the commutator. Show that $M(a)$ is a subgroup of $G$ and that
there is a homomorphism $M(a) \to Z(G)$ with kernel $N(a)$.

21. Prove Theorem 5.

22. Let $G$ be a finite group. If $p$ is a prime, show that $G$ has a normal subgroup of index
$p$ if and only if $p$ divides $|G/G'|$, where $G'$ is the commutator subgroup. [Hint: If $p$
divides $|G/G'|$, apply Theorem 8 to the $p$-primary component of $G/G'$.]

23. Let $G^\omega$ be the group of sequences $[g_i) = (g_0, g_1, g_2, \ldots)$ from a group $G$, with
componentwise multiplication $[g_i) \cdot [h_i) = [g_ih_i)$. (See Exercise 37 §2.10.) Show that, if
$G \neq \{1\}$ is a finite $p$-group, then $G^\omega$ is an infinite $p$-group.

24. If $G$ is a finite $p$-group, show that $G' \neq G$. [Hint: Theorem 8.]

25. If $H \triangleleft G$, where $G$ is a finite $p$-group and $H \neq \{1\}$, show that $H \cap Z(G) \neq \{1\}$.
[Hint: Theorem 1.]

26. Let $G$ be a nonabelian group of order $p^3$, where $p$ is a prime. Show that

(a) $Z(G) = G'$ and this is the unique normal subgroup of $G$ of order $p$.

(b) $G$ has exactly $p^2 + p - 1$ distinct conjugacy classes.

27. Let $G$ be a finite $p$-group and let $H \triangleleft G$. If $|H| = p^m$ and $|G| = p^n$, strengthen
Theorem 8 by showing that a series $G = G_0 \supset G_1 \supset \cdots \supset G_n = \{1\}$ exists such
that $G_i \triangleleft G$ and $|G_i/G_{i+1}| = p$ for all $i$, and that $G_{n-m} = H$. [Hint: Exercise 25.]

28. Let $G$ be a group of order $p^n$ and let $H_1, \ldots, H_m$ be the distinct subgroups of $G$ of
index $p$. If $N = H_1 \cap \cdots \cap H_m$, show that $N \triangleleft G$ and that $x^p = 1$ for every coset $x$
in $G/N$. [Remark: In fact $H_i \triangleleft G$ for each $i$ (see the Corollary of Theorem 1 §8.3).]

29. If $H \neq G$ is a subgroup of a finite $p$-group $G$, show that $H \neq N(H)$. [Hint: If
$C = \operatorname{core} H$ (Exercise 26 §2.8), let $Z(G/C) = K/C$, and show that $K \not\subseteq H$ and
$K \subseteq N(H)$.]

30. Let $K \triangleleft G$ where $G/K$ is a finite $p$-group, $p$ a prime. Show that $G$ has a normal
subgroup of index $p$.

31. If $p$ is an odd prime, let $G = \mathbb{Z}_p \times \mathbb{Z}_p \times \mathbb{Z}_p$ and define an operation on $G$ by
$(x, y, z) \cdot (x_1, y_1, z_1) = (x + x_1, y + y_1, z + z_1 - yx_1)$. Show that $G$ is a nonabelian
group of order $p^3$ in which $a^p = 1$ for all $a \in G$.

32. Let $p$ be a prime and let $X = \{0, p, 2p, \ldots, (p-1)p\}$ be the (additive) subgroup
of $\mathbb{Z}_{p^2}$ generated by $p$. Define an operation on the cartesian product $G = X \times \mathbb{Z}_{p^2}$
by $(x, y) \cdot (x_1, y_1) = (x + x_1, y + y_1 - yx_1)$. Show that $G$ is a nonabelian group of order
$p^3$ that contains an element of order $p^2$.

33. A group $G$ is called an FC-group if every conjugacy class is finite. Show that

(a) If $|G : Z(G)|$ is finite, then $G$ is an FC-group.

(b) If $G$ is a finitely generated FC-group, show that $|G : Z(G)|$ is finite. [Hint:
Exercise 33 §2.6.]

(c) If $G = \langle X \rangle$, show that $G$ is an FC-group if and only if $|\operatorname{class} x|$ is finite for all
$x \in X$.

(d) Show that every subgroup and image of an FC-group is an FC-group.

(e) If $G$ is any group, show that $G^* = \{a \in G \mid \operatorname{class} a \text{ is finite}\}$ is an FC-group which
is a characteristic subgroup of $G$.


8.3. GROUP ACTIONS 


A mathematician, like a painter or a poet, is a maker of patterns. 


—Godfrey Harold Hardy 


If $G$ is a finite group of order $n$, Cayley’s theorem asserts that there exists a
one-to-one group homomorphism $G \to S_n$. The proof proceeds as follows. Given $a \in G$, we
define the multiplication map $\sigma_a : G \to G$ by $\sigma_a(g) = ag$ for all $g \in G$. We can easily
verify that $\sigma_a$ is a bijection and so belongs to the group $S_G$ of all permutations of
the set $G$. The proof is then completed by observing that the map $G \to S_G$ given by
$a \mapsto \sigma_a$ is a one-to-one homomorphism and so embeds $G$ in the permutation group
$S_G$. (Of course, $S_G \cong S_n$ because $|G| = n$.)

The action of the permutation $\sigma_a : G \to G$ is left multiplication by $a$. The key
observation in this section is that there are sets other than $G$ on which an element
of $G$ can act by multiplication. For example, if $H$ is a subgroup of $G$ of index $m$,
let $X = \{gH \mid g \in G\}$ denote the set of all left cosets. Then for $a \in G$ we define

$$\tau_a : X \to X \quad\text{by}\quad \tau_a(gH) = a(gH) = agH \quad\text{for all } gH \text{ in } X.$$

One verifies that $\tau_a$ is a (well defined) bijection for each $a \in G$ and so $\tau_a \in S_X$.
Moreover, $\tau_{ab} = \tau_a\tau_b$ because $\tau_{ab}(gH) = abgH = a(bgH) = \tau_a[\tau_b(gH)]$ for all $g$.
Since this holds for all $a$ and $b$, the map

$$\varphi : G \to S_X \quad\text{given by}\quad \varphi(a) = \tau_a \quad\text{for all } a \in G$$

is a group homomorphism. However, unlike the map in Cayley’s theorem, $\varphi$ may
have a nontrivial kernel:

$$\begin{aligned}
\ker\varphi &= \{a \in G \mid agH = gH \text{ for all } g \in G\} \\
&= \{a \in G \mid g^{-1}ag \in H \text{ for all } g \in G\} \\
&= \{a \in G \mid a \in gHg^{-1} \text{ for all } g \in G\} \\
&= \bigcap_{g \in G} gHg^{-1}.
\end{aligned}$$


This group is important enough to warrant a name.

If $H$ is a subgroup of a group $G$, the core of $H$ in $G$, denoted $\operatorname{core} H$, is defined
to be the intersection of all the conjugates of $H$ in $G$; that is,

$$\operatorname{core} H = \{a \in G \mid a \in gHg^{-1} \text{ for all } g \in G\} = \bigcap_{g \in G} gHg^{-1}.$$

Thus, $\operatorname{core} H \triangleleft G$ by the preceding discussion, and $\operatorname{core} H \subseteq H$ because $H$ is a
conjugate of itself. Furthermore, $\operatorname{core} H$ is the largest normal subgroup of $G$ that is
contained in $H$. We record this fact for reference, and leave the proof as Exercise 9.




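Lemma 1. Let $H$ be a subgroup of a group $G$.
(1) $\operatorname{core} H \triangleleft G$ and $\operatorname{core} H \subseteq H$.
(2) If $K \triangleleft G$ and $K \subseteq H$, then $K \subseteq \operatorname{core} H$.

Our present interest in $\operatorname{core} H$ comes from Theorem 1.


Theorem 1. Extended Cayley Theorem. If $H$ is a subgroup of finite index $m$
in a group $G$, there is a group homomorphism $\theta : G \to S_m$ with $\ker\theta = \operatorname{core} H \subseteq H$.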


Proof. If $X = \{gH \mid g \in G\}$, let $\varphi : G \to S_X$ be defined as above. As $|X| = m$,
there is an isomorphism $\delta : S_X \to S_m$, so we obtain a homomorphism $\delta\varphi : G \to S_m$,
and $\ker\delta\varphi = \ker\varphi = \operatorname{core} H$ by the preceding discussion. Then $\theta = \delta\varphi$ does it. ∎
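
The identity $\ker\theta = \operatorname{core} H$ can be checked on a concrete case. The sketch below takes $G = S_4$ and $H$ a dihedral subgroup of order $8$ (so $|G : H| = 3$), computes the kernel of the action on left cosets, and compares it with the intersection of the conjugates of $H$. The choice of $G$, $H$, and all helper names are assumptions made for the illustration.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse of the permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(generators):
    """Return the finite group generated by the given permutation tuples."""
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(g, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

S4 = list(permutations(range(4)))
H = generate([(1, 2, 3, 0), (0, 3, 2, 1)])  # a dihedral subgroup of order 8

# X = set of left cosets gH; a acts on X by a.(gH) = agH.
cosets = {frozenset(compose(g, h) for h in H) for g in S4}

def acts_trivially(a):
    """True when a fixes every left coset, i.e. a lies in the kernel."""
    return all(frozenset(compose(a, x) for x in c) == c for c in cosets)

kernel = {a for a in S4 if acts_trivially(a)}

# core H = intersection of all conjugates g H g^{-1}.
core = set(S4)
for g in S4:
    core &= {compose(compose(g, h), inverse(g)) for h in H}

print(len(cosets), len(kernel), kernel == core)
```

In this instance the kernel is the four-element normal subgroup $K$ of Example 3 §8.2.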


This is Cayley’s theorem when H = {1}. Example 1 illustrates how to use it. 


Example 1. If $|G| = 36$ and $G$ has a subgroup $H$ of order 9,⁸⁹ then $G$ is not simple.
Indeed, $|G : H| = 4$ so, by Theorem 1, there is a homomorphism $\theta : G \to S_4$
with $\ker\theta \subseteq H$. If $\ker\theta = \{1\}$, then $G \cong \theta(G)$, a contradiction as $|G| = 36$ and
$|\theta(G)| \leq |S_4| = 24$. So $\ker\theta \neq \{1\}$ is normal in $G$, and it is proper because $\ker\theta \subseteq H$.


In Section 2.8, we showed that any subgroup of index 2 is normal. The next 
result gives a generalization that is especially useful for finite p-groups. 


Corollary. Let p be the smallest prime dividing the order of a finite group G. 
Then any subgroup of G of index p is normal in G. 


Proof. Let $|G| = p^k q^m r^n \cdots$, where $p < q < r < \cdots$ are primes. If $|G : H| = p$,
then $|H| = p^{k-1} q^m r^n \cdots$. By Theorem 1 let $\theta : G \to S_p$ be a homomorphism with
$\ker\theta \subseteq H$ and write $K = \ker\theta$. If $|K| = p^{k-1-k_0} q^{m-m_0} r^{n-n_0} \cdots$, then we have
$|G/K| = p^{1+k_0} q^{m_0} r^{n_0} \cdots$ divides $|S_p| = p!$ and so $p^{k_0} q^{m_0} r^{n_0} \cdots$ divides $(p-1)!$.
But this implies that $k_0 = m_0 = n_0 = \cdots = 0$ because every prime divisor of $(p-1)!$ is
less than $p$. Hence, $H = K \triangleleft G$. ∎


We give more applications of the extended Cayley theorem later; our present aim
is to generalize it. The key to the theorem is the existence of the homomorphism
$\varphi : G \to S_X$, where $G$ is a group and $X$ is some set. Because the image $\varphi(G)$ of $G$
is a subgroup of $S_X$, the natural place to begin is to consider this situation.

Hence suppose that $X$ is a nonempty set and that $G$ is a subgroup of the group
$S_X$ of all permutations of $X$. For $x \in X$ and $\sigma \in G$, the element $\sigma(x)$ of $X$ is
specified, which amounts to a mapping $G \times X \to X$ where $(\sigma, x) \mapsto \sigma(x)$. We can
now describe an apparently more general situation.


Group Actions 


Let $G$ be a group and let $X$ be a nonempty set. A mapping $G \times X \to X$, denoted
$(a, x) \mapsto a \cdot x$, is called an action of $G$ if it satisfies the following conditions.

A1  $1 \cdot x = x$ for all $x \in X$.

A2  $a \cdot (b \cdot x) = (ab) \cdot x$ for all $x \in X$ and for all $a, b \in G$.

In this case, $G$ is said to act on $X$ and $X$ is called a $G$-set.⁹⁰


⁸⁹ Such a subgroup $H$ must in fact exist (see Theorem 1 §8.4).
⁹⁰ An action on the right may be defined by $(a, x) \mapsto x \cdot a \in X$. This is nothing new because
$a \cdot x = x \cdot a^{-1}$ is then an action in the present sense.




Hence an action of $G$ on $X$ is nothing more than a multiplication of any element
$x$ of $X$ by any element $a$ of $G$ to yield a (uniquely determined) element $a \cdot x$ of $X$
that satisfies axioms A1 and A2. There are many examples of such actions, and
Example 2 recaptures the above discussion.


Example 2. If $X$ is any nonempty set and $G \subseteq S_X$ is any group of permutations
of $X$, define $\sigma \cdot x = \sigma(x)$ for all $x \in X$ and $\sigma \in G$. Then axioms A1 and A2 are
clearly satisfied; in fact, A2 is the definition of composition of mappings.


Example 3. Let $H$ be a subgroup of a group $G$. Consider $G$ as a set for the moment
and let $H$ act on $G$ by $h \cdot x = hx$ for all $x \in G$, $h \in H$. This is clearly an action;
and $H$ is said to act on $G$ by left multiplication.


Example 4. If $G$ is a group, let $G$ act on itself by $a \cdot x = axa^{-1}$ for all $x \in G$ and
$a \in G$. Then axiom A1 is clear and A2 holds because

$$a \cdot (b \cdot x) = a(bxb^{-1})a^{-1} = (ab)x(ab)^{-1} = (ab) \cdot x$$

for $x \in G$ and $a, b \in G$. In this case, $G$ is said to act on itself by conjugation.


Example 5. Let $H$ be a subgroup of a group $G$ and let $X = \{gH \mid g \in G\}$ denote
the set of left cosets of $H$ in $G$. If $a \in G$ and $gH \in X$ then $a \cdot (gH) = agH$ is well
defined (verify) and so is an action of $G$ on $X$. As we have shown, this action plays
an essential role in the derivation of the extended Cayley theorem.


Example 6. If $X$ is any set and $G$ is any group, we define the trivial action by
$a \cdot x = x$ for all $x \in X$ and $a \in G$. Clearly, the axioms are satisfied.


Examples 2-6 show that group actions are commonly occurring phenomena, 
and other examples below underline this conclusion. Lemma 2 isolates two useful 
properties of group actions that we use repeatedly. 


Lemma 2. Let $X$ be a $G$-set, where $G$ is a group, and let $x, y \in X$ and $a, b \in G$.
(1) If $a \cdot x = a \cdot y$, then $x = y$.
(2) $a \cdot x = b \cdot y$ if and only if $(b^{-1}a) \cdot x = y$.


Proof. Clearly, (1) follows from (2) and axiom A1. If $a \cdot x = b \cdot y$, then

$$(b^{-1}a) \cdot x = b^{-1} \cdot (a \cdot x) = b^{-1} \cdot (b \cdot y) = (b^{-1}b) \cdot y = 1 \cdot y = y,$$

which proves half of (2). The other implication is proved similarly. ∎


We can now give a natural generalization of the extended Cayley theorem. If $G$
is a group, $X$ is a $G$-set and $a \in G$, define

$$\sigma_a : X \to X \quad\text{by}\quad \sigma_a(x) = a \cdot x \quad\text{for all } x \in X. \tag{$*$}$$

Then Lemma 2 shows that $\sigma_a$ is a bijection and so is a member of the group $S_X$ of
all permutations of $X$. Moreover, if $a, b \in G$, then axiom A2 gives

$$\sigma_{ab}(x) = (ab) \cdot x = a \cdot (b \cdot x) = \sigma_a[\sigma_b(x)] = (\sigma_a\sigma_b)(x)$$

for all $x \in X$. Hence, $\sigma_{ab} = \sigma_a\sigma_b$, so the map $\theta : G \to S_X$, defined by $\theta(a) = \sigma_a$ for
all $a \in G$, is a group homomorphism. This gives parts (1) and (2) of Theorem 2.




Theorem 2. Let $G$ be a group, let $X$ be a $G$-set, and let $\sigma_a$ be defined as in $(*)$,
where $a \in G$. Then

(1) $\sigma_a \in S_X$ for all $a \in G$.
(2) $\theta : G \to S_X$ given by $\theta(a) = \sigma_a$ is a group homomorphism.
(3) $\ker\theta = \{a \in G \mid a \cdot x = x \text{ for all } x \in X\}$.


Proof. It remains to prove (3). But $a \in \ker\theta$ means $\sigma_a = 1_X$; that is, $\sigma_a(x) = x$
for all $x \in X$. This condition means that $a \cdot x = x$ for all $x$, as required. ∎


If $H$ is a subgroup of $G$, the extended Cayley theorem is clearly the special
case of Theorem 2 where $X = \{gH \mid g \in G\}$ and the action of $G$ on $X$ is given by
$a \cdot (gH) = agH$ for all $gH \in X$ and $a \in G$. In this case $\ker\theta = \operatorname{core} H$. In general,
if $X$ is a $G$-set then $\ker\theta \triangleleft G$ is given by

$$\ker\theta = \{a \in G \mid a \cdot x = x \text{ for all } x \in X\}.$$

The normal subgroup $\ker\theta$ is called the fixer of the action. The reason for the name
is that $\ker\theta$ consists of the elements of $G$ that fix every $x \in X$. Here an element
$x \in X$ is said to be fixed by $a \in G$ if $a \cdot x = x$.


Example 7. Let $G$ be a group and let $G$ act on itself by conjugation: $a \cdot x = axa^{-1}$
for all $x \in G$ and $a \in G$. Here the bijection $\sigma_a : G \to G$ in Theorem 2 is the inner
automorphism of $G$ induced by $a$. Hence $\theta(G) = \operatorname{inn} G$, where $\theta : G \to S_G$ is the
homomorphism in Theorem 2, and the fixer is $\ker\theta = \{a \in G \mid a \cdot x = x \text{ for all }
x \in G\} = Z(G)$ in this case. Thus, Theorem 2 gives $G/Z(G) \cong \operatorname{inn} G$, a result
derived earlier (Theorem 5 §2.10).


Orbit Decomposition Theorem 


So far, the theory of group actions has been motivated by the urge to generalize 
the extended Cayley theorem. However, the theory yields an additional bonus: It 
provides a fundamental, natural generalization of the class equation. 

Recall that elements $x$ and $y$ in a group $G$ are called conjugate in $G$ if $y = axa^{-1}$
for some $a \in G$. If we regard $G$ as acting on itself by conjugation, this condition is
$y = a \cdot x$ for some $a \in G$. This suggests a generalization: If $X$ is a $G$-set, we define
a relation $\equiv$ on $X$ as follows. If $x, y \in X$, we write

$$x \equiv y \pmod{G} \quad\text{if } y = a \cdot x \text{ for some } a \in G.$$

One easily verifies that $\equiv$ is an equivalence on $X$ (Exercise 17). Moreover, we may
describe the equivalence class $[x]$ of an element $x$ of $X$ in terms of the action:
$[x] = \{y \in X \mid y \equiv x\} = \{a \cdot x \mid a \in G\}$. This equivalence class is called the orbit
of $x$ under $G$ and is denoted

$$G \cdot x = \{a \cdot x \mid a \in G\}.$$

Hence, if $G$ acts on itself by conjugation, the orbits are just the conjugacy classes.
Cosets also occur as orbits.


Example 8. Let $H$ be a subgroup of a group $G$ and let $H$ act on $G$ by left
multiplication: $h \cdot x = hx$ for all $x \in G$, $h \in H$. Then the orbit of $x \in G$ is the right
coset $Hx = H \cdot x$.




A key step in the derivation of the class equation is the observation that the
number of elements in the conjugacy class of $x \in G$ is just the index in $G$ of the
normalizer $N(x)$ of $x$ in $G$. Surprisingly, if $X$ is any $G$-set, the size of each orbit is
the index of a certain subgroup of $G$.


Lemma 3. If $X$ is a $G$-set and $x \in X$, write $S(x) = \{a \in G \mid a \cdot x = x\}$. Then
(1) $S(x)$ is a subgroup of $G$ for each $x \in X$.
(2) $|G \cdot x| = |G : S(x)|$ for each $x \in X$.


Proof. The proof of (1) is left as Exercise 23. Given $x \in X$, write $S(x) = S$ and
define a function $\varphi : G \cdot x \to \{gS \mid g \in G\}$ by $\varphi(g \cdot x) = gS$. Then

$$g \cdot x = h \cdot x \iff (h^{-1}g) \cdot x = x \iff h^{-1}g \in S \iff gS = hS,$$

so $\varphi$ is well defined and one-to-one. Since $\varphi$ is clearly onto, this proves (2) because
$|G : S| = |\{gS \mid g \in G\}|$. ∎
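
Lemma 3 (2) can be checked on a small action with more than one orbit. The sketch below assumes $G$ is the subgroup of $S_5$ generated by a 3-cycle and a disjoint transposition, acting on $X = \{0, 1, 2, 3, 4\}$ by $g \cdot x = g(x)$; the group and the helper names are chosen only for illustration.

```python
def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def generate(generators):
    """Return the finite group generated by the given permutation tuples."""
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(g, x)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

a = (1, 2, 0, 3, 4)  # the 3-cycle 0 -> 1 -> 2 -> 0
b = (0, 1, 2, 4, 3)  # the transposition swapping 3 and 4
G = generate([a, b])
X = range(5)

def orbit(x):
    """G.x = {g.x | g in G}, where g.x is the image of x under g."""
    return {g[x] for g in G}

def stabilizer(x):
    """S(x) = {g in G | g.x = x}."""
    return [g for g in G if g[x] == x]

# Pair each orbit size |G.x| with the index |G : S(x)| = |G| / |S(x)|.
sizes = {x: (len(orbit(x)), len(G) // len(stabilizer(x))) for x in X}
print(sizes)
```

The points $0, 1, 2$ lie in an orbit of size $3$ and the points $3, 4$ in an orbit of size $2$; in each case the orbit size agrees with the index of the stabilizer.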
If $X$ is a $G$-set and $x \in X$, the subgroup

$$S(x) = \{a \in G \mid a \cdot x = x\}$$

is called the stabilizer⁹¹ of $x$. If $G$ acts on itself by conjugation, the stabilizer of $x$
in $G$ is $S(x) = \{a \in G \mid axa^{-1} = x\} = N(x)$, the normalizer of $x$, and the orbit of
$x$ is $G \cdot x = \{gxg^{-1} \mid g \in G\} = \operatorname{class} x$. Hence, Lemma 3 gives $|\operatorname{class} x| = |G : N(x)|$
in this case, a result proved earlier (Theorem 2 §8.2).

If $X$ is a $G$-set and $x \in X$, the orbit of $x$ is $G \cdot x = \{a \cdot x \mid a \in G\}$. Combining
this with Lemma 3 gives equivalent conditions that the orbit is a singleton:

$$G \cdot x = \{x\} \iff a \cdot x = x \text{ for all } a \in G \iff S(x) = G. \tag{$**$}$$

The set of all such elements $x$ is denoted

$$X_f = \{x \in X \mid a \cdot x = x \text{ for all } a \in G\}$$

and is called the fixed subset of $X$ under the action of $G$. With this we can give
the promised generalization of the class equation.


Theorem 3. Orbit Decomposition Theorem. Let a group $G$ act on a finite set
$X \neq \emptyset$ and let $G \cdot x_1, G \cdot x_2, \ldots, G \cdot x_n$ denote the nonsingleton orbits. Then

$$|X| = |X_f| + \sum_{i=1}^{n} |G : S(x_i)|.$$


Proof. The fixed subset $X_f$ is the union of the singleton orbits by $(**)$. Because
the orbits partition $X$, $|X| = |X_f| + \sum_{i=1}^{n} |G \cdot x_i|$. Now apply Lemma 3. ∎
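
The orbit decomposition can be seen concretely on a small example. The sketch below assumes the additive group $\mathbb{Z}_4$ acts on the $16$ colourings of the vertices of a square with two colours by rotating positions; the action `act` and the encoding of colourings as 0/1 tuples are illustrative choices. Because $\mathbb{Z}_4$ is a $2$-group, the sizes of the nonsingleton orbits are all even.

```python
from itertools import product

# X = all 2-colourings of 4 vertices, encoded as tuples of 0s and 1s.
X = list(product((0, 1), repeat=4))

def act(k, x):
    """Action of k in Z_4 on a colouring: rotate the tuple left by k places."""
    k %= 4
    return x[k:] + x[:k]

def orbit(x):
    """The orbit of x under Z_4."""
    return frozenset(act(k, x) for k in range(4))

def stabilizer_index(x):
    """The index |G : S(x)| = 4 / |S(x)|."""
    return 4 // len([k for k in range(4) if act(k, x) == x])

orbits = {orbit(x) for x in X}
fixed = [x for x in X if len(orbit(x)) == 1]          # the fixed subset X_f
nonsingleton = [o for o in orbits if len(o) > 1]
indices = sorted(stabilizer_index(min(o)) for o in nonsingleton)

# |X| = |X_f| + sum of |G : S(x_i)| over the nonsingleton orbits.
print(len(X), len(fixed), indices)
```

Here the decomposition reads $16 = 2 + (2 + 4 + 4 + 4)$, and $|X| - |X_f| = 14$ is divisible by $2$.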


Theorem 3 becomes the class equation if $X = G$ and $G$ acts on itself by conjugation
because the fixed subset is $G_f = \{x \in G \mid axa^{-1} = x \text{ for all } a \in G\} = Z(G)$.

In the terminology of Theorem 3 the index $|G : S(x_i)| = |G \cdot x_i|$ is finite because
$X$ is a finite set and, if the group $G$ is itself finite, it is a divisor of $|G|$. This
property is particularly important when $G$ is a finite $p$-group, where $p$ is a prime,
because then $p$ divides $|G : S(x_i)|$ for each $i$. Hence, Theorem 3 shows that $p$ divides
$|X| - |X_f|$ in this case, which is important enough to record as Theorem 4.


⁹¹ Another name for $S(x)$ is the isotropy group of $x$.




Theorem 4. Let $p$ be a prime and let $G$ be a finite $p$-group. If $X$ is a finite $G$-set,
then $p$ divides $|X| - |X_f|$.


We use this result repeatedly in the next Section. For now we illustrate how to use 
it by proving an important property of finite p-groups that will recur in Chapter 9. 


Theorem 5. Let $G$ be a finite $p$-group, where $p$ is a prime. If $H \neq G$ is a subgroup,
then $N(H) \neq H$.


Proof. Let $X = \{xH \mid x \in G\}$ denote the set of left cosets of $H$ in $G$, and let $H$
act on $X$ by left multiplication: $h \cdot (xH) = hxH$ for all $x \in G$ and $h \in H$. Then $p$
divides $|X| = |G : H|$ and so $|X_f| \neq 1$ by Theorem 4. Now

$$\begin{aligned}
X_f &= \{xH \mid hxH = xH \text{ for all } h \in H\} \\
&= \{xH \mid x^{-1}Hx \subseteq H\} \\
&= \{xH \mid x \in N(H)\}.
\end{aligned}$$

Hence, $N(H) \neq H$ since otherwise $X_f = \{H\}$, contrary to $|X_f| \neq 1$. ∎


We conclude this section by sketching J.H. McKay’s beautiful proof 9? of Cauchy’s 
theorem, which applies Theorem 4 and avoids proving the abelian case separately. 
If G is a group and p is a prime divisor of |G|, we must find an element of order p 
in G. McKay’s idea is to consider the set of p-tuples with product 1: 


X ={(a1,...,@p) | ai EG, @102°*+ Ap = I}. 


What is needed is a p-tuple (a,a,...,a) in X with a#1. To this end, let the 
(additive) group Zp act on X by “cycling” the p-tuples: 


he (Gry vagg) = (Gighy erp Gp; Gist yap);,. forall RS Zp. 


We leave to the reader the task of verifying that this is a well-defined action
and that the fixed subset is $X_f = \{(a, a, \ldots, a) \mid a^p = 1\}$. Hence, Cauchy’s theorem
follows if $|X_f| \neq 1$, and this in turn holds (by Theorem 4) if p divides $|X|$. But this
latter condition follows because p divides $|G|$ and $|X| = |G|^{p-1}$ (indeed, in choosing
$(a_1, \ldots, a_p)$ in X, the elements $a_1, \ldots, a_{p-1}$ can be selected arbitrarily from G,
after which $a_p = (a_1 \cdots a_{p-1})^{-1}$ is determined).
This completes a most elegant proof. 
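
The counting in McKay's argument can be checked by brute force on a small group. The following is a minimal sketch, assuming $G = S_3$ (permutations stored as tuples) and $p = 3$; the helper names `compose` and `mckay_counts` are illustrative and not from the text.

```python
from itertools import permutations, product

def compose(s, t):
    """Return the permutation s∘t (apply t first, then s)."""
    return tuple(s[t[i]] for i in range(len(t)))

def mckay_counts(group, identity, p):
    """Return (|X|, |X_f|) for McKay's set X of p-tuples with product 1."""
    tuples = []
    for tup in product(group, repeat=p):
        prod = identity
        for g in tup:
            prod = compose(prod, g)
        if prod == identity:
            tuples.append(tup)
    # A tuple is fixed by every cyclic shift exactly when all entries agree.
    fixed = [t for t in tuples if all(x == t[0] for x in t)]
    return len(tuples), len(fixed)

S3 = list(permutations(range(3)))
identity = (0, 1, 2)
size_X, size_Xf = mckay_counts(S3, identity, 3)
print(size_X, size_Xf)  # |X| = |G|^(p-1) = 36 and |X_f| = 3
```

Here $|X| - |X_f| = 33$ is divisible by 3, and the three fixed tuples correspond to the identity and the two 3-cycles, which are the elements of order 3 that Cauchy's theorem promises.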


Exercises 8.3 


1. (a) If $|G| = 20$, show that G has a normal subgroup of order 5.
(b) If $|G| = 28$, show that G has a normal subgroup of order 7.

2. If $|G| = 24$ and G has a subgroup of order 8, show that G is not simple.

3. If p and q are primes, show that no group of order pq is simple.

4. Show that every group of order 15 is cyclic. [Hint: If $o(a) = 5$ and $o(b) = 3$, show
that $bab^{-1} = a^k$ for some k. Deduce that $b^n a b^{-n} = a^{k^n}$ for each n and hence $k = 1$.]

5. If $|G| = pm$, where p is a prime and $p > m$, show that any subgroup of order p is
normal in G. (Such subgroups exist by Cauchy’s theorem.)

6. (a) If $n \geq 5$ and $p \neq n$ is a prime, show that $A_n$ has no subgroup of index p.
(b) If p is a prime, show that $A_p$ has a subgroup of index p.




92 American Mathematical Monthly, Vol. 66, 1959, p. 119.


7. If H and K are subgroups of G, show that $\operatorname{core}(H \cap K) = \operatorname{core} H \cap \operatorname{core} K$.

8. If G is the group of all $2 \times 2$ invertible matrices over $\mathbb{R}$, find core H, where H is the
group of diagonal matrices $\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}$ in G.

9. Prove Lemma 1.


10. If H is a subgroup of G, define $H_p = \bigcap \{\sigma(H) \mid \sigma \in \operatorname{aut} G\}$.

(a) Show that $H_p$ is characteristic in G and $H_p \subseteq H$.

(b) If K is characteristic in G and $K \subseteq H$, show that $K \subseteq H_p$.

11. Show that the following are equivalent for a group G.

(1) G has a nontrivial finite G-set.

(2) G has a proper normal subgroup of finite index.

(3) G has a proper subgroup of finite index.

12. Given $m > 1$, show that a finitely generated group G has at most a finite number
of subgroups of index m. [Hint: If $C = \{K \mid K = \operatorname{core} H \text{ where } |G : H| = m\}$, show
that C is a finite set and that, given $K \in C$, there are at most a finite number of
subgroups H with $K \subseteq H$.]

13. Let $G = (\mathbb{R}, +)$ and define $a \cdot z = e^{ia}z$ for all $z \in \mathbb{C}$ and $a \in G$. Show that $\mathbb{C}$ is a
G-set, describe the action geometrically, and find all orbits and stabilizers.

14. Let $X = \mathbb{R}[x_1, x_2, \ldots, x_n]$, the polynomial ring in the indeterminates $x_1, \ldots, x_n$.
Given $\sigma \in S_n$ and $f = f(x_1, \ldots, x_n) \in X$, define $\sigma \cdot f = f(x_{\sigma 1}, x_{\sigma 2}, \ldots, x_{\sigma n})$. Show
that this is an action and describe the fixer. If $n = 3$, give three polynomials in the
fixer and compute $S_3 \cdot g$ and $S(g)$, where $g(x_1, x_2, x_3) = x_1 + x_2$.


15. Write $X_n = \{1, 2, \ldots, n\}$. If $\sigma \in S_n$, write $G = \langle \sigma \rangle$ and let the elements of G act on
$X_n$ as mappings. Describe the relationship between the orbits of G in $X_n$ and the
factorization of $\sigma$ into disjoint cycles.

16. Let $\theta : G \to S_X$ be a group homomorphism, where X is a nonempty set. Show that
$\theta$ arises, as in Theorem 2, from some action of G on X.

17. Let X be a G-set. Show that

(a) Equivalence modulo G (defined prior to Example 8) is an equivalence on X.
(b) Every equivalence on X arises, as in (a), from some group action on X.

18. Let X be a G-set. If F is the fixer, show that X is a G/F-set in a natural way and
that the fixer is trivial (such actions are called faithful).

19. Show that a group G acts on its set of subgroups by conjugation and that $Z(G) \subseteq F$,
where F is the fixer. Give an example where $Z(G) \neq F$.

20. Is every normal subgroup of a group G the fixer of some action of G? Give reasons.

21. If H is a subgroup of G, find a G-set X and an element $x \in X$ such that $H = S(x)$.

22. If $H \lhd G$, define the centralizer of H in G as $C(H) = \{a \in G \mid ah = ha \text{ for all } h \in H\}$.
Use Theorem 2 to show that $C(H) \lhd G$ and that $G/[C(H)]$ is isomorphic
to a group of automorphisms of H.

23. Let X be a G-set and let x and y denote elements of X.

(a) Show that $S(x)$ is a subgroup of G.

(b) If $x \in X$ and $b \in G$, show that $S(b \cdot x) = bS(x)b^{-1}$.

(c) If $S(x)$ and $S(y)$ are conjugate subgroups, show that $|G \cdot x| = |G \cdot y|$.

24. Let X be a G-set with just one orbit (called a transitive action).

(a) If $K \lhd G$, show that $K \subseteq S(x)$ for some $x \in X$ if and only if K is contained in
the fixer. [Hint: Exercise 23.]

(b) If $|X| \geq 2$, show that $g \in G$ exists such that $g \cdot x \neq x$ for all $x \in X$. [Hint:
Exercise 10 §8.2.]


25. Let X be a G-set, let H be a subgroup of G, and let $x \in X$. Show that:

(a) H acts on the orbit $G \cdot x$ by $h \cdot (a \cdot x) = (ha) \cdot x$ for all $h \in H$ and $a \cdot x \in G \cdot x$.
(b) If $H \lhd G$, the orbits of H in $G \cdot x$ all have the same cardinality.

26. Let G be a finite p-group. If $\{1\} \neq H \lhd G$, show that $H \cap Z(G) \neq \{1\}$. [Hint: Let
G act on H by conjugation.]

27. If G is a finite p-group, show that the number of nonnormal subgroups of G is a
multiple of p. [Hint: Let G act on its subgroups by conjugation.]

28. If G is a finite p-group, show that the number of subgroups of order $p^k$ is congruent
(modulo p) to the number of normal subgroups of order $p^k$.

29. Let $H_1, \ldots, H_m$ be all the subgroups of index p in a finite p-group G. Show that
$K = \bigcap_{i=1}^{m} H_i$ is normal in G, that G/K is abelian, and that $o(x) = p$ for all nonunity
elements $x \in G/K$. [Hint: Theorem 5 and Exercise 6 §8.2.]

30. Let H and K be subgroups of a group G. Show that K has $|H : H \cap N(K)|$ distinct
conjugates of the form $hKh^{-1}$, where $h \in H$. Here $N(K) = N_G(K)$ is the normalizer.

31. If H and K are finite subgroups of some group, prove that $|HK| \cdot |H \cap K| = |H| \cdot |K|$
by letting $H \cap K$ act on $H \times K$ by $a \cdot (h, k) = (ha^{-1}, ak)$. [Hint: Show that each
orbit has the same number of elements.]

32. Let H and K be subgroups of a group G and let $H \times K$ act on G by $(h, k) \cdot x = hxk^{-1}$
for all $x \in G$ and $(h, k) \in H \times K$. Show:

(a) This is an action and the orbit of $x \in G$ is $HxK$ (called a double coset).
(b) If $x \in G$, then $|S(x)| = |H \cap xKx^{-1}| = |x^{-1}Hx \cap K|$.
(c) Frobenius theorem: If $Hx_1K, Hx_2K, \ldots, Hx_nK$ are the distinct double
cosets of a finite group G, then

$$|G| = \sum_{i=1}^{n} \frac{|H||K|}{|H \cap x_iKx_i^{-1}|}.$$


33. If X and Y are G-sets, a map $\varphi : X \to Y$ is called a G-morphism if $\varphi(a \cdot x) = a \cdot \varphi(x)$
holds for all $x \in X$ and $a \in G$. If, in addition, $\varphi$ is a bijection, it is called a
G-isomorphism, and X and Y are called isomorphic G-sets. Call a G-set
transitive if there is just one orbit. If H is a subgroup of G, let G/H denote the
G-set of left H-cosets using left multiplication. (H need not be normal.)

(a) Show that G/H is transitive for any subgroup H of G.

(b) Show that every transitive G-set is G-isomorphic to G/H for some subgroup H.

34. Let $\varphi : X \to Y$ be an onto G-morphism, where X and Y are G-sets (Exercise 33).
Define a relation $\sim$ on X by $x \sim x_1$ if $\varphi(x) = \varphi(x_1)$. This relation is an equivalence
(called the kernel equivalence of $\varphi$), and we denote the equivalence class of $x \in X$
by $[x] = \{t \in X \mid \varphi(t) = \varphi(x)\}$. Finally, let $X/\varphi = \{[x] \mid x \in X\}$ denote the set of
equivalence classes.

(a) Show that $X/\varphi$ is a G-set via $a \cdot [x] = [a \cdot x]$ for all $[x]$ in $X/\varphi$ and all $a \in G$.

(b) Find a G-isomorphism $X/\varphi \to Y$ (the G-set isomorphism theorem).


8.4 THE SYLOW THEOREMS 


Lagrange’s theorem asserts that the order of each subgroup of a finite group G is 
a divisor of $|G|$. The converse is false: $A_4$ has no subgroup of order 6 even though
$|A_4| = 12$. However, if $p^k$ divides the order of G where p is a prime, then G does
have a subgroup of order $p^k$. This remarkable theorem was first proven in 1872 (for
permutation groups) by the Norwegian mathematician Ludwig Sylow and has been
ranked with Lagrange’s theorem as being among the most important results about
finite groups. The version presented here for abstract groups was proven in 1887 by
Georg Frobenius, and this proof uses only Cauchy’s theorem and the class equation.
We give another more modern direct proof using the theory of group actions at the
end of this section.


Theorem 1. Let G be a finite group. If p is a prime and $p^k$ divides $|G|$ for some
$k \geq 0$, then G has a subgroup of order $p^k$.


Proof. It is clear if $k = 0$, so assume $k \geq 1$. Proceed by induction on $|G|$. The
theorem is clear if $|G| = 1$, 2, or 3. In general, $|G| = |Z(G)| + \sum_{i=1}^{n} |G : N(a_i)|$
by the class equation, where class $a_1, \ldots,$ class $a_n$ are the nonsingleton conjugacy
classes. If $p^k$ divides $|N(a_i)|$ for some i then $N(a_i)$ has a subgroup of order $p^k$ by
induction because $|N(a_i)| < |G|$.

So assume that $p^k$ does not divide $|N(a_i)|$ for any $i \geq 1$. Because $p^k$ divides
$|G| = |N(a_i)||G : N(a_i)|$ for all i, it follows that p divides $|G : N(a_i)|$ for every
i. Hence, p divides $|Z(G)|$ by the class equation so, by Cauchy’s theorem, choose
$a \in Z(G)$ with $o(a) = p$. If we write $K = \langle a \rangle$ then $K \lhd G$ and $|G/K| = \frac{1}{p}|G| < |G|$.
Moreover, $p^{k-1}$ divides $|G/K|$ because $p^k$ divides $|G|$. Hence, again by induction,
G/K has a subgroup H/K with $|H/K| = p^{k-1}$. As $|H| = p^k$, we are done. ∎


Sylow originally proved Theorem 1 in the special case where $p^k$ is the highest
power of the prime p that divides the order of the group.


Corollary. Sylow’s First Theorem. If G is a group of order $p^n m$, where p is a
prime and p does not divide m, then G has a subgroup of order $p^n$.


If G is a group of order $p^n m$, where p is a prime and p does not divide m, any
subgroup of order $p^n$ is called a Sylow p-subgroup of G.


Example 1. Write $D_3 = \{1, a, a^2, b, ba, ba^2\}$, where $o(a) = 3$, $o(b) = 2$, and $aba = b$.
Then $H = \{1, a, a^2\}$ is the unique Sylow 3-subgroup, but $\{1, b\}$, $\{1, ba\}$, and $\{1, ba^2\}$
are three Sylow 2-subgroups. Hence, the Sylow 2-subgroups may not be normal or
unique. We show later that a Sylow p-subgroup is normal if and only if it is unique.


Example 2. If G is a finite abelian group and p is a prime divisor of $|G|$, let
$G(p) = \{a \in G \mid o(a) = p^k \text{ for some } k \geq 0\}$. This set is a subgroup (because G is
abelian) and so is a p-subgroup of G that contains every p-subgroup. It is thus the 
unique Sylow p-subgroup of G, called the p-primary component of G. 


Corollary 2 of Theorem 3 §7.2 shows that every finite abelian group is isomorphic 
to the direct product of its primary components and thus of its distinct Sylow p- 
subgroups. We characterize when this happens in a nonabelian group in Section 
9.3. 

A Sylow p-subgroup P of a finite group G is a p-subgroup of G of maximum
possible order (by Theorem 1). Note that each conjugate $aPa^{-1}$ of P is also a
Sylow p-subgroup because $|aPa^{-1}| = |P|$. The converse is also true: Every Sylow
p-subgroup is conjugate to P. In fact, every p-subgroup of G is contained in a
conjugate of P (and so is contained in a Sylow p-subgroup of G).


Theorem 2. Let P be a Sylow p-subgroup of a finite group G. If H is any
p-subgroup of G, then $H \subseteq aPa^{-1}$ for some $a \in G$.


Proof. Let $X = \{aP \mid a \in G\}$ be the set of left cosets of P in G and let H act
on X by left multiplication: $h \cdot aP = haP$ for all $h \in H$. Write $|G| = p^n m$, where
p does not divide m. Then $|X| = |G : P| = m$, so p does not divide $|X|$. Hence
(because H is a p-group) Theorem 4 §8.3 shows that p does not divide $|X_f|$, where
$X_f$ is the fixed subset. In particular, $X_f$ is not empty, so let $aP \in X_f$, $a \in G$.
Then $haP = h \cdot aP = aP$ for all $h \in H$, whence $a^{-1}ha \in P$ for all $h \in H$. Thus
$H \subseteq aPa^{-1}$, as required. ∎


Taking H to be any Sylow p-subgroup of G, we obtain 


Corollary 1. Sylow’s Second Theorem. If G is a finite group, any two Sylow 
p-subgroups of G are conjugate in G. 


Because a subgroup of G is normal in G if and only if it equals all its conjugates in 
G, we get Corollary 2. 


Corollary 2. A Sylow p-subgroup of a finite group G is normal in G if and only if 
it is unique. 


Example 3. Given $D_3$, as in Example 1, the Sylow 2-subgroups $\{1, b\}$, $\{1, ba\}$, and
$\{1, ba^2\}$ must be conjugate. In fact, $a\{1, b\}a^{-1} = \{1, ba\}$ and $a^2\{1, b\}a^{-2} = \{1, ba^2\}$.
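
The conjugacy of the three Sylow 2-subgroups can also be confirmed by direct computation. The sketch below assumes the standard identification of $D_3$ with $S_3$ (permutations of three points stored as tuples); the helper names are illustrative.

```python
from itertools import permutations

def compose(s, t):
    """Return the permutation s∘t (apply t first, then s)."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    """Return the inverse permutation of s."""
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(3)))
e = (0, 1, 2)

# In a group of order 6 the Sylow 2-subgroups have order 2: {e, t} with t*t = e.
sylow2 = [frozenset({e, t}) for t in G if t != e and compose(t, t) == e]

def conjugate(g, H):
    """Return the conjugate subgroup g H g^{-1}."""
    return frozenset(compose(compose(g, h), inverse(g)) for h in H)

P = sylow2[0]
conjugates_of_P = {conjugate(g, P) for g in G}
print(len(sylow2), conjugates_of_P == set(sylow2))
```

Every Sylow 2-subgroup appears among the conjugates of a single one, as Sylow's second theorem requires.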


The next result identifies a fundamental technique due to Giovanni Frattini. We 
return to this in Section 9.3. 


Corollary 3. Frattini Argument. Let $H \lhd G$ and let P be a Sylow p-subgroup
of H. Then we have $G = HN_G(P)$, where $N_G(P)$ is the normalizer of P in G.


Proof. Suppose $g \in G$. Then $gPg^{-1} \subseteq H$ because $H \lhd G$, so $gPg^{-1}$ is also a
Sylow p-subgroup of H. By Corollary 1, $h(gPg^{-1})h^{-1} = P$ for some $h \in H$, so
$hg \in N_G(P)$. Hence $g = h^{-1}(hg) \in HN_G(P)$, and the result follows. ∎


If $K \lhd G$, Example 4 employs Theorem 2 to provide some useful information
about the Sylow subgroups of K and G/K.


Example 4. Let $K \lhd G$, where G is a finite group, and let p be a prime.


(1) A subgroup of K is a Sylow p-subgroup of K if and only if it has the form
$P \cap K$, where P is a Sylow p-subgroup of G.

(2) A subgroup of G/K is a Sylow p-subgroup of G/K if and only if it has the
form $(PK)/K$, where P is a Sylow p-subgroup of G.


Solution.
(1) We begin with one of the implications in (1).


Claim: If P is a Sylow p-subgroup of G then $P \cap K$ is a Sylow p-subgroup of K.


Proof. $P \cap K$ is a p-subgroup of K, so let $P \cap K \subseteq X$ where X is a Sylow p-subgroup
of K. But X is a p-subgroup of G so $X \subseteq aPa^{-1}$ for some $a \in G$ by Theorem 2. It
follows that

$$a^{-1}(P \cap K)a \subseteq a^{-1}Xa \subseteq P \cap (a^{-1}Ka) = P \cap K.$$

Hence, these sets are equal (the first and last have the same finite order), and in
particular $P \cap K = X$. This proves the Claim.


Now if H is a Sylow p-subgroup of K, then $H \subseteq P$ for some Sylow p-subgroup
P of G. Hence, $H \subseteq P \cap K$, so $H = P \cap K$ by the Claim. This proves (1).


(2) Let $|G| = p^n m$ where p does not divide m. By Lagrange’s theorem, let
$|K| = p^k r$ where $k \leq n$ and $r \mid m$. Thus $|G/K| = p^{n-k}\frac{m}{r}$. Let $H/K$ be some Sylow
p-subgroup of $G/K$, so $|H/K| = p^{n-k}$. Then $|H| = p^n r$. So if P is a Sylow p-subgroup
of H, then P is a Sylow p-subgroup of G, and it remains to show $PK = H$.
But $|PK/K| = |P/(P \cap K)|$ by the second isomorphism theorem (Theorem 1 §8.1), and
$|P \cap K| = p^k$ by (1). Hence, $|PK/K| = p^{n-k}$ so $|PK| = p^n r = |H|$. Since $PK \subseteq H$,
this gives $PK = H$, as required. The proof that each of the groups $PK/K$ is a Sylow
p-subgroup of $G/K$ is left to the reader. □


The third Sylow theorem is concerned with the number of Sylow p-subgroups
of a finite group G where p is a prime. We will denote this by $n_p$, or $n_p(G)$ if the
group must be identified. Although determining $n_p$ from the order of the group is
not possible in general, we can deduce a good deal of numerical information.


Theorem 3. Sylow’s Third Theorem. Let G be a group of order $p^n m$, where
p is a prime, $n \geq 1$, and p does not divide m. If $n_p$ denotes the number of distinct
Sylow p-subgroups of G, then:

(1) $n_p \equiv 1 \pmod{p}$.

(2) $n_p$ divides m.

(3) $n_p = |G : N(P)|$, where P is any Sylow p-subgroup of G.


Proof. By Sylow’s second theorem, (3) follows by Theorem 2 §8.2, so $n_p$ divides
$|G| = p^n m$. Hence, (2) follows from (1) because (1) implies that $\gcd(p, n_p) = 1$.
To prove (1), let X denote the set of all Sylow p-subgroups of G so that $|X| = n_p$.
Fix P in X and let P act on X by conjugation. If $X_f$ is the fixed subset, then
$n_p = |X| \equiv |X_f|$ modulo p by Theorem 4 §8.3, so it suffices to show that $X_f = \{P\}$.
We have $X_f = \{Q \in X \mid aQa^{-1} = Q \text{ for all } a \in P\}$, so $P \in X_f$ is clear. If $Q \in X_f$,
then $P \subseteq N(Q)$, so both P and Q are Sylow p-subgroups of $N(Q)$ (they are
p-subgroups of maximal order). But $Q \lhd N(Q)$, so $Q = P$ follows by Corollary 2 of
Theorem 2. Hence $X_f = \{P\}$, as required. ∎
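
Conditions (1) and (2) of Theorem 3 restrict $n_p$ to a short list that is easy to compute. The following is a minimal sketch; the function name `sylow_candidates` is an illustrative choice, and the function only lists the values permitted by the theorem, not the value that actually occurs in a given group.

```python
def sylow_candidates(order, p):
    """Values of n_p allowed by Theorem 3 for a group of the given order.

    Writes order = p^n * m with p not dividing m, then returns the divisors d
    of m that satisfy d ≡ 1 (mod p).
    """
    m = order
    while m % p == 0:
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]

print(sylow_candidates(175, 5), sylow_candidates(175, 7))  # order 175 = 5^2 * 7
print(sylow_candidates(56, 7), sylow_candidates(56, 2))    # order 56 = 2^3 * 7
print(sylow_candidates(72, 3), sylow_candidates(72, 2))    # order 72 = 2^3 * 3^2
```

When the list is `[1]`, the Sylow p-subgroup is unique and hence normal by Corollary 2 of Theorem 2; longer lists call for a further argument such as counting elements or using a homomorphism into a symmetric group.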


Examples 5-9 illustrate the power of the Sylow theorems and how to apply them 
to particular groups. 


Example 5. If p and q are primes, show that no group G of order pq is simple.


Solution. If $p = q$, then G is abelian by Theorem 7 §8.2, so G is not simple because
$|G| = p^2$ is not a prime. So assume that $p > q$. Then $n_p \equiv 1 \pmod{p}$ and $n_p \mid q$ by
Theorem 3, so $n_p = 1$ because $q < p$. Thus, there is just one Sylow p-subgroup, and
it is normal by Corollary 2 of Theorem 2. □


Example 6. Show that every group of order 175 is abelian. 


Solution. Observe $|G| = 175 = 5^2 \cdot 7$. Then $n_5 \mid 7$ and $n_5 \equiv 1 \pmod{5}$ by Theorem
3, from which $n_5 = 1$. Hence, there is just one Sylow 5-subgroup P of G and so
$P \lhd G$. Similarly $n_7 \mid 5^2$ so $n_7 = 1$, 5, or 25, and $n_7 \equiv 1 \pmod{7}$. Thus $n_7 = 1$, so
there is a unique Sylow 7-subgroup Q of G and $Q \lhd G$. Now $P \cap Q = \{1\}$ because
$\gcd(|P|, |Q|) = 1$, so $|PQ| = |P||Q| = |G|$. Thus $G = PQ$, so $G \cong P \times Q$ by Theorem
3 §8.1. But P and Q are abelian by Theorem 7 §8.2, so G is abelian. □


Example 7. Show that there is no simple group of order 56.


Solution. If $|G| = 56 = 2^3 \cdot 7$, then $n_7$ divides 8 and $n_7 \equiv 1 \pmod{7}$. This means
that $n_7 = 1$ or $n_7 = 8$. If $n_7 = 1$, then the Sylow 7-subgroup is normal. If $n_7 = 8$,
there are eight distinct cyclic subgroups in G of order 7. Because the intersection
of any two of these subgroups equals $\{1\}$, there are $8 \cdot 6 = 48$ elements of order 7.
This leaves only eight elements to make up a Sylow 2-subgroup (which has order 8),
so the Sylow 2-subgroup is unique and hence normal. □


Example 8. Show that there is no simple group of order 72. 


Solution. If $|G| = 72 = 2^3 \cdot 3^2$, then $n_2 = 1$, 3, or 9 and $n_3 = 1$ or 4 by Theorem 3,
so the method in Example 7 fails. However, let P denote any Sylow 3-subgroup of
G. If $n_3 = 1$, then $P \lhd G$. If $n_3 = 4$, then $|G : N(P)| = 4$ by Sylow’s third theorem.
Thus, Theorem 1 §8.3 provides a homomorphism $\theta : G \to S_4$, with $\ker\theta \subseteq N(P)$,
and $\ker\theta \neq \{1\}$ because $|G| = 72$ does not divide $|S_4| = 24$. As $\ker\theta \lhd G$ and
$N(P) \neq G$, we are done. □


A famous theorem of William Burnside asserts that no group of order $p^n q^m$ is
simple where p and q are primes. The proof involves the theory of group
representations and is beyond the scope of this book. However, we can do the following
case.


Example 9. Show that no group of order $p^2q^2$ is simple when p and q are primes.


Solution. Let $|G| = p^2q^2$. If $p = q$, then $Z(G) \neq \{1\}$ by Theorem 6 §8.2 and we
are finished. So assume that $p > q$. We have $n_p = 1$, q, or $q^2$, and $n_p \equiv 1 \pmod{p}$.
If $n_p = 1$, the Sylow p-subgroup is normal and we are done. The case $n_p = q$ is
impossible because $q < p$. So assume that $n_p = q^2$, obtaining $q^2 \equiv 1 \pmod{p}$. Then
p divides $q^2 - 1 = (q - 1)(q + 1)$, so either $p \mid (q - 1)$ or $p \mid (q + 1)$. Because $p > q$,
the first alternative is impossible; the second implies that $q + 1 \geq p > q$ from which
$q + 1 = p$. This means that $p = 3$ and $q = 2$, so $|G| = 36$. But then any Sylow
3-subgroup has index 4, so there is a homomorphism $\theta : G \to S_4$ by Theorem 1
§8.3. This homomorphism cannot be one-to-one because $|G| = 36 > 24 = |S_4|$, so
$\ker\theta \neq \{1\}$ is a proper normal subgroup of G. □


We are now going to use the Sylow theorems to characterize the groups of order 
less than 16. It turns out that three of these groups belong to a family of groups 
that resemble the dihedral groups and are constructed in much the same way. We 
let $n = 2m$ be an even positive integer, let $\omega = e^{2\pi i/n}$, and let

$$A = \begin{bmatrix} \omega & 0 \\ 0 & \omega^{-1} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$


Then A and B are invertible complex matrices, and we can easily verify that $|A| = n$,
$ABA = B$, and $B^2 = A^m$. One verifies that


$$G = \{I, A, A^2, \ldots, A^{n-1}, B, BA, \ldots, BA^{n-1}\}$$


is a subgroup of $GL_2(\mathbb{C})$, and $|G| = 2n$ because $\langle A \rangle$ has index 2 in G. As for $D_n$,
we abstract this situation as follows.
If $n = 2m$, $m \geq 1$, the dicyclic group $Q_n$ is the group of order 2n presented as


$$Q_n = \{1, a, \ldots, a^{n-1}, b, ba, \ldots, ba^{n-1}\}, \quad \text{where } o(a) = n,\ aba = b, \text{ and } b^2 = a^m.$$
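
The defining relations can be checked numerically for small m. This is a sketch under an explicit assumption: it takes $A = \operatorname{diag}(\omega, \omega^{-1})$ and $B$ with rows $(0, 1)$ and $(-1, 0)$, one concrete pair of matrices satisfying the stated relations; the helper names are illustrative and comparisons use a floating-point tolerance.

```python
import cmath

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(X, e):
    """X raised to the nonnegative integer power e."""
    R = [[1, 0], [0, 1]]
    for _ in range(e):
        R = matmul(R, X)
    return R

def close(X, Y, tol=1e-9):
    """Entrywise comparison up to a floating-point tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol for i in range(2) for j in range(2))

def dicyclic_relations_hold(m):
    """Check |A| = n, ABA = B and B^2 = A^m for n = 2m (assumed A and B)."""
    n = 2 * m
    w = cmath.exp(2j * cmath.pi / n)
    A = [[w, 0], [0, 1 / w]]
    B = [[0, 1], [-1, 0]]
    I = [[1, 0], [0, 1]]
    order_is_n = close(matpow(A, n), I) and all(
        not close(matpow(A, k), I) for k in range(1, n))
    return (order_is_n
            and close(matmul(matmul(A, B), A), B)
            and close(matmul(B, B), matpow(A, m)))

print([dicyclic_relations_hold(m) for m in range(1, 7)])
```

For each m tested, A has order n = 2m and both relations hold, in line with the claim that these matrices generate a group of order 2n.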


The condition that $|Q_n| = 2n$ amounts to the requirement that $b \notin \langle a \rangle$. The group
$Q_n$ is presented just like $D_n$, except that here $n = 2m$ must be even and $b^2 = a^m$
(recall that $b^2 = 1$ in the dihedral case). Again, $a^k b a^k = b$ for all $k \in \mathbb{Z}$, so


$$a^k b = ba^{-k} = ba^{n-k} \quad \text{for all } k \in \mathbb{Z}.$$

This equation shows that $(ba^k)^2 = b^2 = a^m$ for all k so, as $|a^m| = 2$, we get

$$|ba^k| = 4 \quad \text{for all } k \in \mathbb{Z}$$


(in contrast to $|ba^k| = 2$ in $D_n$). Two of these dicyclic groups are already familiar,
as we see in Example 10.


Example 10. $Q_2 \cong C_4$ because $|Q_2| = 4$. We claim that $Q_4 \cong Q$, the quaternion
group (Section 2.8). Indeed $Q = \{\pm 1, \pm i, \pm j, \pm k\}$ and, writing $a = i$ and $b = j$, we
have $o(a) = 4$, $aba = b$, and $b^2 = a^2$. Hence $Q \cong Q_4$.


Theorem 4. Every group G of order 8 is isomorphic to one of $C_8$, $C_4 \times C_2$,
$C_2 \times C_2 \times C_2$, $D_4$, or $Q_4 \cong Q$.


Proof. If G is abelian, then G is isomorphic to one of $C_8$, $C_4 \times C_2$, or $C_2 \times C_2 \times C_2$
by Example 8 §2.8 (or by Corollary 2 of Theorem 6 §7.2). If G is not abelian, then
$x^2 = 1$ cannot hold for all $x \in G$, so there exists $a \in G$, $o(a) = 4$. Write $K = \langle a \rangle$.
If $b \notin K$, then $G = K \cup Kb$, and we claim that $aba = b$. Indeed $bab^{-1} \in K$ because
$K \lhd G$, and $o(bab^{-1}) = o(a) = 4$. As $bab^{-1} \neq a$ because G is not abelian, we get
$bab^{-1} = a^{-1}$; that is, $aba = b$. Hence, if $o(b) = 2$ for some $b \notin K$, then $G \cong D_4$.
Otherwise, $o(b) = 4$ for all $b \notin K$. But then $a^2$ is the only element of G of order 2,
so $b^2 = a^2$ for all $b \notin K$. Thus $G \cong Q_4$. ∎


In order to determine the groups of order 12, we need Lemma 1.

Lemma 1. The only subgroup of $S_n$ of index 2 is $A_n$.


Proof. If $|S_n : K| = 2$, then $K \lhd S_n$ and $|S_n/K| = 2$, so $\sigma^2 \in K$ for all $\sigma \in S_n$.
If $\sigma$ is a 3-cycle, then $\sigma^3 = \varepsilon$ so $\sigma = \sigma^4 \in K$. But $A_n$ is generated by the 3-cycles
(Lemma 4 §2.8), so $A_n \subseteq K$. This implies that $A_n = K$ because $|S_n : A_n| = 2$. ∎


Theorem 5. Every group of order 12 is isomorphic to one of $C_{12}$, $C_6 \times C_2$, $A_4$,
$D_6$, or $Q_6$.


Proof. Let P and Q be Sylow subgroups with $|P| = 3$ and $|Q| = 4$. If G is abelian,
$G \cong P \times Q$, so either $G \cong C_3 \times C_4 \cong C_{12}$ or $G \cong C_3 \times C_2 \times C_2 \cong C_6 \times C_2$. If G
is nonabelian, there is a homomorphism $\theta : G \to S_4$ with $\ker\theta \subseteq P$. If $\ker\theta = \{1\}$,
then $G \cong A_4$ by Lemma 1. So assume $P \lhd G$. Similarly, we have $\varphi : G \to S_3$ with
$\ker\varphi \subseteq Q$. Write $\ker\varphi = L$. Then $L \neq \{1\}$, and $L = Q$ implies that $Q \lhd G$, so
$G \cong P \times Q$ is abelian. Hence $|L| = 2$, so $LP \cong L \times P \cong C_6$.

So let $a \in G$ have order 6 and write $K = \langle a \rangle$. If $b \notin K$, then $G = K \cup Kb$ and
$aba = b$, as in Theorem 4. Finally, $b^2 \in K$ because $|G/K| = 2$, and it remains to
show that $b^2 = 1$ ($G \cong D_6$) or $b^2 = a^3$ ($G \cong Q_6$). If $b^2 = a$ or $a^5$, then $o(b) = 12$ and
G is abelian. If $b^2 = a^2$, then $b^3 = ba^2 = a^{-2}b = a^4b$, so $b^2 = a^4$, a contradiction.
Hence $b^2 \neq a^2$ and, similarly, $b^2 \neq a^4$. ∎


These results, together with earlier work, enable us to describe all the groups 
of order 15 or less. If p is a prime, the only group of order p is $C_p$. There are
two groups of order 2p: $C_{2p}$ and $D_p$ (Theorem 3 §2.6). And there are two groups
of order $p^2$: $C_{p^2}$ and $C_p \times C_p$ (Theorem 7 §8.2). We have already described the
groups of order 8 or 12, and the only group of order 15 is $C_{15}$ (Exercise 4). This list
describes every group of order at most 15. The description of the groups of order 16
is more complicated (there are 14), and the general problem of describing all groups
of a given order is extremely difficult.$^{93}$

We conclude this section with an elegant direct proof of Theorem 1. The
argument requires a number-theoretic fact. Recall that the binomial coefficient
$\binom{n}{r}$ is defined by $\binom{n}{r} = \frac{n!}{r!(n-r)!}$ for $0 \leq r \leq n$.


Lemma 2. Let p be a prime and let m, n, and k be positive integers. Then $p^n$
divides m if and only if $p^n$ divides $\binom{p^k m}{p^k}$.


Proof. Since $\binom{p^k m}{p^k} = m \prod_{i=1}^{p^k - 1} \frac{p^k m - i}{p^k - i}$, it suffices to show that


$p^n$ divides $(p^k m - i)$ $\iff$ $p^n$ divides $(p^k - i)$ for each $i = 1, 2, \ldots, p^k - 1$.


Observe that if $p^n$ divides $(p^k m - i)$ or $(p^k - i)$ then $n < k$ (otherwise $p^k \mid i$). Hence, the proof
is completed by the observation that $(p^k m - i) = (p^k m - p^k) + (p^k - i)$. ∎
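
Lemma 2 says that m and $\binom{p^k m}{p^k}$ are divisible by exactly the same powers of p, which is easy to test numerically. The following is a small sketch; the helper names `p_adic_valuation` and `lemma_holds` are illustrative.

```python
from math import comb

def p_adic_valuation(x, p):
    """Largest v such that p**v divides the positive integer x."""
    v = 0
    while x % p == 0:
        x //= p
        v += 1
    return v

def lemma_holds(p, m, k):
    """True when m and C(p^k * m, p^k) are divisible by the same powers of p."""
    return p_adic_valuation(comb(p**k * m, p**k), p) == p_adic_valuation(m, p)

print(all(lemma_holds(p, m, k)
          for p in (2, 3, 5) for m in range(1, 21) for k in range(1, 4)))
```

Equality of the two valuations is equivalent to the statement of the lemma holding for every exponent n at once.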


With this we can give Helmut Wielandt’s elegant proof$^{94}$ of Theorem 1, which
does not use induction or Cauchy’s theorem (and so provides another proof of 
Cauchy’s theorem). 


Proof of Theorem 1. If $p^k$ divides $|G|$, let $X = \{U \subseteq G \mid |U| = p^k\}$ and
define an action of G on X by $a \cdot U = aU$ for all U in X and a in G. Given U in
X, let $S(U) = \{a \in G \mid aU = U\}$ denote the stabilizer. Write $|G| = p^k m$ and write
$m = p^n w$, where p does not divide w.


Claim. V in X exists such that $p^{n+1}$ does not divide $|G : S(V)|$.


Proof. If not, $p^{n+1}$ divides the order of every orbit in X (by Lemma 3 §8.3) and
so divides $|X|$. But $|X| = \binom{p^k m}{p^k}$, which means that $p^{n+1}$ divides m by Lemma 2.
Hence $p \mid w$, a contradiction. This proves the Claim.

Now let V be as in the Claim and write $S = S(V)$. We show that $|S| = p^k$, so S
is the desired subgroup of G. Now $p^{n+1}$ does not divide $\frac{|G|}{|S|} = \frac{p^{k+n}w}{|S|}$ by the Claim,
from which $p^k$ does divide $|S|$. In particular, $p^k \leq |S|$. But if $v \in V$, then $Sv \subseteq V$
by the definition of S, so $|S| = |Sv| \leq |V| = p^k$. Thus $|S| = p^k$, as required. ∎


Peter Ludvig Mejdell Sylow (1832-1918) Sylow was born in Norway and spent most 
of his professional life as a high school teacher in Halden. Despite onerous teaching 
duties, he found time to study the works of Abel and, in 1862-1863, he gave lectures 
on Galois theory and permutation groups at Christiania University in Oslo. The Sylow 
theorems were published in 1872 for permutation groups (Georg Frobenius extended 
them to abstract groups in 1887). These theorems are among the most important results
on finite groups. Sylow applied them to show that any equation whose Galois group has
prime-power order is solvable in radicals.


In addition to his study of groups, Sylow spent eight years editing the works of Abel.
After his retirement from teaching high school, he was appointed to a chair at Christiania
University, a position he held for the rest of his life.


93 A lot is known about the number of groups of order $p^n$ for a prime p. For example there are
2328 groups of order $2^7$, and 9310 groups of order $3^7$. See O’Brien, E.A. and Vaughan-Lee, M.R.
Journal of Algebra 292 (2005), 243-258, for more information.

94 Wielandt, H., Ein Beweis für die Existenz der Sylowgruppen, Archiv der Mathematik 10 (1959),
401-402.


Exercises 8.4


1. Find all Sylow 3-subgroups of $S_4$ and show explicitly that all are conjugate.

2. Find all Sylow 2-subgroups of $D_n$, where n is odd, and show explicitly that all are
conjugate.

3. If P is a Sylow p-subgroup of G, prove that it is the only Sylow p-subgroup of $N(P)$.

4. Show that every group of order 15 is cyclic.

5. Show that there is only one group of order 1001.

6. Show that there are exactly two groups of order 99.

7. Show that a group G is not simple if

(a) $|G| = 40$ (b) $|G| = 80$ (c) $|G| = 48$ (d) $|G| = 108$

8. Show that no group of order 520 is simple.

9. Show that G has a cyclic normal subgroup of index 2 if

(a) $|G| = 70$ (b) $|G| = 154$ (c) $|G| = 30$

10. Show that G has a cyclic normal subgroup of index 5 if

(a) $|G| = 385$ (b) $|G| = 455$

11. (a) Show that G has a cyclic normal subgroup of index 3 if $|G| = 105$.

(b) Show that G has an abelian normal subgroup of index 4 if $|G| = 700$.

12. If $|G| = pq$, where $p < q$ are primes and p does not divide $q - 1$, show that G is
cyclic.

13. If $|G| = p^n m$, where $n \geq 1$, p is a prime, and $p > m$, show that the Sylow p-subgroup
of G is normal in G.

14. If $|G| = pq$, where p and q are primes, show that G is not simple.

15. If P is a normal Sylow p-subgroup of a finite group G, show that P is fully invariant
in G; that is, $\alpha(P) \subseteq P$ for every homomorphism $\alpha : G \to G$.

16. Let $P \lhd H$ and $H \lhd G$. If P is a Sylow subgroup of G, show that $P \lhd G$.

17. If P is a Sylow p-subgroup of G, show that $[N(P)]/P$ has no element of p-power
order except the unity.

18. If P is a Sylow p-subgroup of G, let $N(P) \subseteq H$, H a subgroup of G.

(a) Show that $N(H) = H$. [Hint: If $a \in N(H)$, show that $aPa^{-1} \subseteq H$ and use
Sylow’s second theorem.]

(b) Show that p does not divide $|G : H|$.

19. If $N(P) = P$ for some Sylow p-subgroup P of G, show that $N(Q) = Q$ for every
Sylow p-subgroup Q of G.

20. Suppose that $N(P) = P$ for some Sylow p-subgroup of the finite group G. Show
that $G/G'$ is an (abelian) p-group. [Hint: If $q \neq p$ is a prime divisor of $|G/G'|$, use
Theorems 4 §7.2 and 8 §8.2 to find a subgroup $H/G'$ of index q in $G/G'$. If Q is a
Sylow p-subgroup of H, show that $N(Q) = Q$ by Exercise 18 and apply Exercise 19.]

21. Let K denote the intersection of all the Sylow p-subgroups of a finite group G. Show
that K is a normal p-subgroup of G that contains every normal p-subgroup of G.

22. If $n = 2m$ and $m \geq 2$, show that $Z(Q_n) = \{1, a^m\}$.




23. If $k \mid n$, $k \geq 4$, and k is even, show that $Q_n$ has a subgroup isomorphic to $Q_k$.

24. If $k \mid n$, k is even, and $n/k$ is odd, show that $K \lhd Q_n$ exists such that $Q_n/K \cong Q_k$.

25. Show that, if G is a nonabelian group and $1 < |G| < 60$, then G is not simple. (Of
course, $|A_5| = 60$ and $A_5$ is simple.)


8.5 SEMIDIRECT PRODUCTS 


There is no doubt that forming direct products is an important method of con- 
structing groups using smaller groups: For example, all finite abelian groups can 
be constructed by forming direct products of cyclic groups. In this brief section, we 
describe a more general way to form a group from smaller groups. 

Let K and H be subgroups of a group G and assume that $G = KH$, where
$K \cap H = \{1\}$. If both $K \lhd G$ and $H \lhd G$ then $G \cong K \times H$ by Theorem 3 §8.1. In
view of this, a natural question is what happens if only one of K and H is normal
in G, say $K \lhd G$. Since $G = KH$ and $K \cap H = \{1\}$, each element $g \in G$ has a
unique representation as $g = kh$ where $k \in K$ and $h \in H$. Given another element
$g_1 = k_1h_1$ of G, the key to understanding the group G is to describe the product
$gg_1$ in the same form. This can be accomplished as follows:


$$gg_1 = khk_1h_1 = [k(hk_1h^{-1})](hh_1), \tag{$*$}$$


where $hk_1h^{-1} \in K$ because $K \lhd G$.

To describe this more formally, let $a \in G$ and define a map $\sigma_a : K \to K$ by
$\sigma_a(k) = aka^{-1}$ for all $k \in K$. This makes sense because $K \lhd G$, and $\sigma_a$ is an
automorphism of K for each $a \in G$. Hence ($*$) becomes


$$(kh)(k_1h_1) = [k\sigma_h(k_1)](hh_1). \tag{$**$}$$


Now observe that the map $\theta : H \to \operatorname{aut}(K)$ given by $\theta(h) = \sigma_h$ is a group
homomorphism.$^{95}$ This provides a way to turn this around: Starting with K,
H, and $\theta$, we can recreate G, using ($**$) to motivate the multiplication.


Theorem 1. Let K and H be groups and let $\theta : H \to \operatorname{aut} K$ be a homomorphism.
Write $\theta(h) = \sigma_h$ for all $h \in H$. If $G = K \times H$ is the cartesian product, define an
operation on G as follows:


$$(k, h)(k_1, h_1) = (k\sigma_h(k_1), hh_1) \quad \text{for all } (k, h) \text{ and } (k_1, h_1) \text{ in } G.$$

Write $K_1 = K \times 1$ and $H_1 = 1 \times H$. Then:

(1) G is a group using this operation with unity $(1_K, 1_H)$ and inverses given
by $(k, h)^{-1} = (\sigma_{h^{-1}}(k^{-1}), h^{-1})$.

(2) $K_1$ and $H_1$ are subgroups of G, with $K_1 \cong K$ and $H_1 \cong H$.

(3) $G = K_1H_1$, $K_1 \lhd G$ and $K_1 \cap H_1 = \{1\}$.

Proof. (2) and (3) are routine verifications. As to (1), the operation is associative
because the products $[(k, h)(k_1, h_1)](k_2, h_2)$ and $(k, h)[(k_1, h_1)(k_2, h_2)]$ each simplify
to $(k\sigma_h(k_1)\sigma_{hh_1}(k_2), hh_1h_2)$. The verification of this, and of the rest of (1), is left
to the reader. ∎


95 This amounts to saying that $a \cdot k = \sigma_a(k)$ is an action of the group G on K. Such actions were
studied in Section 8.3, where K was only required to be a set.


If K and H are groups and $\theta : H \to \operatorname{aut} K$ is a homomorphism, the group G
constructed in Theorem 1 is called the semidirect product of K by H, and is
denoted $K \rtimes_\theta H$. Theorem 1, and the discussion preceding it, give an important
characterization of semidirect products (often taken as the definition).


Theorem 2. Let G be a group.
(1) G is a semidirect product if and only if it has subgroups K and H with
$G = KH$, $K \lhd G$ and $K \cap H = \{1\}$.
(2) In that case, $G \cong K \rtimes_\theta H$ for some homomorphism $\theta : H \to \operatorname{aut} K$. Indeed,
$\theta(h)$ is (the restriction to K of) conjugation by h for each $h \in H$.


Example 1. A direct product $K \times H$ is a semidirect product (let $\theta : H \to \operatorname{aut} K$
be the trivial homomorphism: $\theta(h) = 1_K$ for each $h \in H$).


Example 2. The dihedral group $D_n = \{1, a, \ldots, a^{n-1}, b, ba, \ldots, ba^{n-1}\}$ is given
by $o(a) = n$, $o(b) = 2$ and $aba = b$. If we write $K = \langle a \rangle$ and $H = \langle b \rangle$ then it is clear
that $D_n = KH$, $K \lhd D_n$ (it has index 2) and $K \cap H = \{1\}$, so


$$D_n \cong C_n \rtimes_\theta C_2 \quad \text{for some } \theta : C_2 \to \operatorname{aut} C_n.$$


In fact, the multiplication in $D_n$ is given by $a^kb = ba^{-k}$ for all k, so $ba^kb^{-1} = (a^k)^{-1}$.
Hence $\theta : C_2 \to \operatorname{aut} C_n$, where $\theta(b)$ is the automorphism $x \mapsto x^{-1}$.


It is interesting to observe that $D_3 \cong C_3 \rtimes_\theta C_2$ by Example 2, and $C_6 \cong C_3 \rtimes_\varphi C_2$
by Example 1. Hence, a semidirect product $K \rtimes_\theta H$ is not uniquely determined by
K and H; the homomorphism $\theta$ must be specified.
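
The dependence on the homomorphism can be seen concretely by building both products from the multiplication in Theorem 1. The sketch below makes some assumptions for illustration: the cyclic groups are written additively as $\mathbb{Z}_n$ and $\mathbb{Z}_2$, the homomorphism is encoded by a unit u with $u^2 \equiv 1 \pmod{n}$ (the generator of $\mathbb{Z}_2$ acts by $k \mapsto uk$), and the function names are hypothetical.

```python
from itertools import product

def make_semidirect(n, u):
    """Return (elements, mul, sigma) for Z_n x Z_2 with the Theorem 1 product.

    u is assumed to satisfy u*u ≡ 1 (mod n), so that h -> (k -> u^h * k)
    is a homomorphism from Z_2 into aut(Z_n).
    """
    if (u * u) % n != 1 % n:
        raise ValueError("u must be a square root of 1 modulo n")

    def sigma(h, k):
        # sigma_h(k): the automorphism theta(h) applied to k
        return (pow(u, h, n) * k) % n

    def mul(x, y):
        (k, h), (k1, h1) = x, y
        return ((k + sigma(h, k1)) % n, (h + h1) % 2)

    return list(product(range(n), range(2))), mul, sigma

def is_abelian(elements, mul):
    return all(mul(x, y) == mul(y, x) for x in elements for y in elements)

def is_associative(elements, mul):
    return all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x in elements for y in elements for z in elements)

C6_elems, C6_mul, C6_sigma = make_semidirect(3, 1)  # trivial homomorphism
D3_elems, D3_mul, D3_sigma = make_semidirect(3, 2)  # generator acts by k -> -k
print(is_abelian(C6_elems, C6_mul), is_abelian(D3_elems, D3_mul))
```

Both operations are associative on the same six-element set, but only the one built from the trivial homomorphism is commutative, which separates $C_6$ from $D_3$.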


The next result determines all groups of order pq where p and q are primes. The
theorem illustrates the way that semidirect products can be used to give the detailed
structure of all groups of a given order, and it extends the theorem (Theorem 3 §2.6)
that every group of order 2p is either cyclic or dihedral.


Theorem 3. Let G be a group of order pq where $p \leq q$ are primes.

(1) If $p = q$ then G is cyclic or $G \cong C_p \times C_p$.

(2) If $p < q$ and $q \not\equiv 1 \pmod{p}$ then G is cyclic.

(3) If $p < q$ and $q \equiv 1 \pmod{p}$ then either G is cyclic or $G = \langle a, b \rangle$ where
$o(a) = q$, $o(b) = p$ and $ab = ba^m$, and where $1 < m \leq q - 1$ and $m^p \equiv 1 \pmod{q}$.
[Here, all choices for m result in isomorphic groups.]


Proof. By Cauchy's theorem, choose $a, b \in G$ such that $o(a) = q$ and $o(b) = p$, and
write $K = \langle a \rangle$ and $H = \langle b \rangle$. Then $K \lhd G$ by the Corollary to Theorem 1 §8.3.
Clearly $K \cap H = \{1\}$, so $G = KH$ and it follows that $G$ is a semidirect product by
Theorem 2.

(1) This is Theorem 7 §8.2. 

(2) As usual, let $n_p$ denote the number of Sylow $p$-subgroups of $G$. Then Sylow's
third theorem (Theorem 3 §8.4) gives $n_p \equiv 1 \pmod{p}$ and $n_p \mid q$. Hence if $q \not\equiv 1 \pmod{p}$
then $n_p = 1$, so $H \lhd G$, and we have $G \cong K \times H \cong C_q \times C_p \cong C_{pq}$.

(3) So assume $q \equiv 1 \pmod{p}$. Since $K = \langle a \rangle \lhd G$ let


$b^{-1}ab = a^x$ where $x \in \mathbb{Z}$. (*)




Then $b^{-2}ab^2 = b^{-1}(a^x)b = (b^{-1}ab)^x = a^{x^2}$. Continuing in this way gives

$b^{-k}ab^k = a^{x^k}$ for any $k \geq 1$.

But $b^p = 1$ so this gives $a^{x^p} = b^{-p}ab^p = a$, and hence $x^p \equiv 1 \pmod{q}$. Since
$p \mid (q - 1) = |\mathbb{Z}_q^*|$ let $m \in \mathbb{Z}_q^*$ have order $p$ (by Cauchy's theorem). Then $x = 1,
m, m^2, \ldots, m^{p-1}$ are all solutions to $x^p \equiv 1 \pmod{q}$. Moreover, there are no other
solutions because $x^p - 1 = 0$ has at most $p$ roots in $\mathbb{Z}_q$. If $x = 1$ in (*) then $ba = ab$
so $G$ is abelian, and hence cyclic. If $x = m$ in (*) then $b^{-1}ab = a^m$ so $ab = ba^m$
and we have the situation in (3).

Finally, to realize the other solutions $x = m^r$ where $1 \leq r \leq p - 1$, we change
the generators of $G$. If we put $b_1 = b^r$ then $o(b_1) = p$ and $H = \langle b_1 \rangle$ because $o(b) = p$
is a prime. Hence $G = \langle a, b_1 \rangle$. Furthermore, $b_1^{-1}ab_1 = b^{-r}ab^r = a^{x^r} = a^{m^r}$, so this
construction realizes the solution when $x = m^r$ in (*). ∎
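
The exponents $m$ permitted in part (3) of Theorem 3 are easy to list for small primes. The sketch below is only an illustration (the function name `elements_of_order_p` is not from the text): because $p$ is prime, any $m \neq 1$ with $m^p \equiv 1 \pmod{q}$ has order exactly $p$ in $\mathbb{Z}_q^*$, so a single modular power suffices as the test.

```python
def elements_of_order_p(p, q):
    """All m with 1 < m <= q-1 and m^p = 1 (mod q), for primes p < q.

    Since p is prime, each such m has multiplicative order exactly p.
    The list is empty when q is not congruent to 1 modulo p.
    """
    return [m for m in range(2, q) if pow(m, p, q) == 1]

print(elements_of_order_p(3, 7))    # candidates for groups of order 21
print(elements_of_order_p(5, 11))   # candidates for groups of order 55
print(elements_of_order_p(3, 5))    # order 15: no nonabelian case
```

When the list is nonempty it has $p - 1$ entries, matching the solutions $m, m^2, \ldots, m^{p-1}$ found in the proof.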


Other such theorems are possible, but we conclude by identifying one quite 
general situation in which a finite group G is necessarily a semidirect product, 
namely if G has a normal subgroup K and |K| and |G/K| are relatively prime. The 
next result of Issai Schur deals with the case when K is abelian. 


Theorem 4. Schur’s Theorem. Suppose G has an abelian normal subgroup K, 
where |K| and |G/K| are relatively prime. Then G has a subgroup of order |G/K|. 


Proof. Put $|K| = m$ and $|G/K| = n$. In each coset $a$ of $G/K$ select an element
$g_a \in G$ and assume that $g_1 = 1$, where 1 denotes the unity of $G/K$. If $a, b \in G/K$
then $g_a g_b = g_{ab}\,k(a,b)$ for a uniquely determined element $k(a,b) \in K$. If $g_a(g_b g_c) =
(g_a g_b)g_c$ is written out it follows that

$k(a,bc)\,k(b,c) = k(ab,c)\,[g_c^{-1}k(a,b)\,g_c]$. (*)

Now write $k(b) = \prod_{a \in G/K} k(a,b)$. If the product of each side of equation (*) is
taken as $a$ ranges over $G/K$, we obtain (since $K$ is abelian)

$k(bc)\,[k(b,c)]^n = k(c)\,[g_c^{-1}k(b)\,g_c]$. (**)

Now let $nn' \equiv 1 \pmod{m}$ and write $k(a)^{-n'} = k_a$ for all $a \in G/K$. Raising both
sides of (**) to the power $-n'$ yields

$k_{bc}\,k(b,c)^{-1} = k_c\,[g_c^{-1}k_b\,g_c]$. (***)

Finally, write $H = \{g_a k_a \mid a \in G/K\}$. Then (***) gives

$(g_b k_b)(g_c k_c) = [g_b g_c][g_c^{-1}k_b g_c]\,k_c$
$= [g_{bc}\,k(b,c)]\,[k_c^{-1}k_{bc}\,k(b,c)^{-1}]\,k_c$
$= g_{bc}k_{bc}$.


This means that $H$ is a subgroup and that the map $a \mapsto g_a k_a$ is an onto
homomorphism $G/K \to H$. It is one-to-one because $g_a k_a = g_b k_b$ implies that
$g_b^{-1}g_a = k_b k_a^{-1} \in K$, so $g_a = g_b$ by the choice of these elements. ∎


Along with I. Schur, H. Zassenhaus is also credited with the general version of
the next theorem.


96 The result was credited to Schur by Zassenhaus in his book The Theory of Groups, 2nd English
Edition, Chelsea, 1958.




Theorem 5. Schur-Zassenhaus Theorem. Let $G$ be a group of order $kn$ where
$k$ and $n$ are relatively prime. Assume that $G$ has a normal subgroup $K$ of order $k$.
Then $G$ has a subgroup $H$ of order $n$, and so is a semidirect product $K \rtimes_\theta H$.


Proof. It suffices to show that $H$ exists (then $K \cap H = \{1\}$ because the orders are
relatively prime, and hence $G = KH$). We may assume that $k > 1$. The proof is by
induction on $|G|$.


Case 1. $K$ contains a proper subgroup $M$ such that $M \lhd G$.


Write $|M| = m$. Then $|G/M| = \frac{k}{m}n$, $K/M \lhd G/M$, and $|K/M| = \frac{k}{m}$, so $G/M$
has a subgroup $L/M$ of order $n$ (by induction). But then $|L| = mn$, $M \lhd L$ and
$|M| = m$, so $L$ contains a subgroup of order $n$ (again by induction). Hence, the
theorem is proved in this case.


Case 2. K contains no proper subgroup that is normal in G. 


In this case let $P$ be a Sylow $p$-subgroup of $K$, and let $N = N(P)$ be the normalizer
of $P$ in $G$. Then $P$ is a Sylow $p$-subgroup of $G$ (because $\gcd(k, n) = 1$), and so has
$|G : N|$ conjugates in $G$. These are all in $K$ so, as $N \cap K$ is the normalizer of $P$ in $K$,
we obtain $|K : (N \cap K)| = |G : N|$. Hence

$\frac{|G|}{|N|} = \frac{|K|}{|N \cap K|}$, so $|G| = \frac{|N|\,|K|}{|N \cap K|} = |NK|$.

Consequently $NK = G$ and so $N/(N \cap K) \cong G/K$ has order $n$.


If it happens that $N \neq G$ this shows that $N$ contains a subgroup of order $n$
(by induction), and we are done. So assume that $N = G$. This means that $P \lhd G$,
and hence that $Z \lhd G$ where $Z$ is the center of $P$ (being characteristic in $P$, see
Corollary 3 of Theorem 3 §2.8). Since $Z \neq 1$ by Theorem 6 §8.2, it follows that
$Z = K$ (we are in Case 2). Thus, $K$ is abelian and we are done by Schur's theorem
(Theorem 4). ∎


Exercises 8.5 


1. (a) Show that $S_n \cong A_n \rtimes_\theta C_2$ for some $\theta$.
(b) Show that the following is false: If $K$ is a maximal subgroup that is normal in
$G$, then $G \cong K \rtimes_\theta H$ for some $H$ and $\theta$. [Hint: The quaternion group.]

2. Find all groups of order 55.

3. Find all groups of order 39.

4. Show that there are two nonisomorphic groups of order 105. [Hint: Exercise 11(a)
§8.4.]

5. Show that there are four nonisomorphic groups of order 30: $C_{30}$, $D_{15}$, $D_5 \times C_3$, and
$D_3 \times C_5$. [Hint: Find a normal subgroup of index 2.]

6. Let $\alpha : G \to H$ be a group homomorphism, and write $\ker(\alpha) = K$. If there exists
$\beta : H \to G$ such that $\alpha\beta = 1_H$, show that $G \cong K \rtimes_\theta H$ for some $\theta$.


8.6 AN APPLICATION TO COMBINATORICS 


A main theme in this chapter has been to apply counting arguments to gain infor- 
mation about a finite group G by defining G-sets and using the orbit decomposition 
theorem. In this section, we turn this successful technique around and use the group 
to gain information about the sets it acts on. Specifically, we get a formula for the 




number of distinct orbits, which is useful in solving certain combinatorial problems. 
We begin by deriving this formula that is of interest in its own right, and then 
describing how it applies to combinatorics. 

If $X$ is any $G$-set and $x \in X$, the stabilizer of $x$ in $X$ is the subgroup

$S(x) = \{a \in G \mid a \cdot x = x\}$

of all elements of $G$ that fix $x$. Dually, if $a \in G$ we write

$F(a) = \{x \in X \mid a \cdot x = x\},$


the set of elements of X fixed by a. We refer to these sets frequently because of the 
following result of Cauchy and Frobenius. 


Theorem 1. Cauchy-Frobenius Lemma.⁹⁷ Let $X$ be a $G$-set and assume that
$G$ and $X$ are finite. If $n$ is the number of distinct orbits of $G$ in $X$, then

$n = \frac{1}{|G|} \sum_{a \in G} |F(a)|.$


Proof. The proof proceeds by the time-honored method of counting the elements
of a set $Y$ in two ways and equating the results. In this case, consider the subset
$Y = \{(a, x) \mid x \in X,\ a \in G,\ a \cdot x = x\}$ of $G \times X$. Then

$|Y| = \sum_{a \in G} |F(a)|$ (*)

because, for each element $a$ of $G$, there are exactly $|F(a)|$ pairs in $Y$ with first
component $a$. In the same way, we obtain

$|Y| = \sum_{x \in X} |S(x)|.$

However, we can refine this second sum because $X$ is partitioned into orbits by the
action of $G$. If $G \cdot x_1, \ldots, G \cdot x_n$ are the $n$ distinct orbits, then each $x \in X$ belongs
to exactly one orbit $G \cdot x_i$, so

$|Y| = \sum_{i=1}^{n} \Big[ \sum_{x \in G \cdot x_i} |S(x)| \Big].$ (**)

Now recall that $|G \cdot x| = |G : S(x)|$ holds for all $x \in X$ (Lemma 3 §8.3). If $x \in G \cdot x_i$,
then $G \cdot x = G \cdot x_i$ and so $|S(x)| = |S(x_i)|$. Hence (**) becomes

$|Y| = \sum_{i=1}^{n} \Big[ \sum_{x \in G \cdot x_i} |S(x_i)| \Big] = \sum_{i=1}^{n} \big[ |G \cdot x_i|\,|S(x_i)| \big] = \sum_{i=1}^{n} |G| = n|G|.$

Combining this with (*) gives $n|G| = |Y| = \sum_{a \in G} |F(a)|$. The lemma follows. ∎

As an illustration, let $G$ act on itself by conjugation, so the orbits are the conjugacy
classes. If $a \in G$, then $F(a) = \{x \in G \mid axa^{-1} = x\} = N(a)$ is the normalizer of $a$
in $G$. Thus, the Cauchy-Frobenius lemma gives the following Corollary.


Corollary. A finite group $G$ has $\frac{1}{|G|} \sum_{a \in G} |N(a)|$ distinct conjugacy classes.
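
The Corollary can be checked numerically on a small group. The sketch below is only an illustration under stated assumptions: permutations of $\{0, 1, 2, 3\}$ are stored as tuples, `compose(s, t)` means "apply `t` first, then `s`", and the helper names are not from the text. It counts the conjugacy classes of $S_4$ once through the lemma and once by listing the classes directly.

```python
from itertools import permutations

def compose(s, t):
    """Composite permutation i -> s(t(i)), with permutations stored as tuples."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

def class_count_by_lemma(group):
    """(1/|G|) * sum over a of |F(a)|, where F(a) = {x : a x a^-1 = x}."""
    fixed_pairs = sum(1 for a in group for x in group
                      if compose(compose(a, x), inverse(a)) == x)
    assert fixed_pairs % len(group) == 0
    return fixed_pairs // len(group)

def class_count_direct(group):
    """Number of distinct orbits {a x a^-1 : a in G}."""
    classes = {frozenset(compose(compose(a, x), inverse(a)) for a in group)
               for x in group}
    return len(classes)

S4 = list(permutations(range(4)))
print(class_count_by_lemma(S4), class_count_direct(S4))
```

Both counts agree with the number of cycle types in $S_4$, which is the number of partitions of 4.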


Before applying the Cauchy-Frobenius lemma, we must consider a technicality.
If $G$ is a group and $X$ is a nonempty set, a function $X \times G \to X$, written $(x, g) \mapsto x \cdot g$,


97 The lemma was known to Cauchy and Frobenius in the mid-1800s, and was rediscovered by
William Burnside in 1900. Hence, it is also called Burnside's lemma.




is called a right action of $G$ on $X$ if $x \cdot 1 = x$ and $(x \cdot a) \cdot b = x \cdot (ab)$ hold for all
$a, b \in G$ and all $x \in X$. Clearly, all the results for $G$-sets can be proved for right
actions. In fact, we can easily verify that $a * x = x \cdot a^{-1}$ defines a (left) action of
$G$ on $X$. The reason for mentioning this is that right actions occur naturally in the
examples that follow.

The combinatorial applications in which we are interested can all be described
using the following format. Let $D$ and $C$ be nonempty finite sets and let $C^D$
denote⁹⁸ the set of all mappings $\lambda : D \to C$. Suppose that $G$ is a subgroup of $S_D$.
Given $\sigma \in G$ and $\lambda \in C^D$, we have $D \xrightarrow{\sigma} D \xrightarrow{\lambda} C$, so it is natural to define

$\lambda \cdot \sigma = \lambda\sigma$ = the composition of the maps.


This is a right action of $G$ on the set $C^D$ (the axioms are elementary
properties of composition of mappings), and it plays a central role in our discussion.
Example 1 is typical.
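
The two right-action axioms for composition can be confirmed by brute force on a small case. This is a sketch under assumed conventions, not part of the text: $D = \{0, 1, 2\}$, a map $\lambda : D \to C$ is stored as the tuple of its values, permutations are tuples, and `act` and `compose` are illustrative names.

```python
from itertools import permutations, product

def compose(s, t):
    """Composite permutation i -> s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

def act(lam, s):
    """Right action lam . s = lam o s, that is, i -> lam(s(i))."""
    return tuple(lam[s[i]] for i in range(len(s)))

n, q = 3, 2
identity = tuple(range(n))
group = list(permutations(range(n)))            # G = S_D with |D| = 3
colorings = list(product(range(q), repeat=n))   # all maps D -> C

# Axiom 1: lam . 1 = lam
axiom_identity = all(act(lam, identity) == lam for lam in colorings)

# Axiom 2: (lam . s) . t = lam . (s t)
axiom_product = all(act(act(lam, s), t) == act(lam, compose(s, t))
                    for lam in colorings for s in group for t in group)

print(axiom_identity, axiom_product)
```

The second axiom is just associativity of composition: both sides send $i$ to $\lambda(\sigma(\tau(i)))$.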


Example 1. If $q$ colors are available, find the number of ways in which a pyramid
can be painted if the edges of the base are all of length 1 and the sides are of length
2. Assume that each face is painted a single color.


Solution. Label the faces 1, 2, 3, and 4 as shown in the figure at the right. Then the
labeled pyramid can be colored in $q^4$ ways because there are $q$ color choices for each
face. The problem is that many of these colorings are indistinguishable when the
labels are removed. The reason is that one labeled coloring may be carried to another
by a motion of the pyramid, so both result in the same unlabeled coloring. To make
this more precise, let

$D = \{1, 2, 3, 4\}$ and $C$ = the set of $q$ colors.


Then each map $\lambda : D \to C$ determines a labeled coloring, the color of face $i$ being
$\lambda(i)$. Conversely, each labeled coloring determines such a map, so we may identify
$C^D$ with the set of labeled colorings. Now let $G \subseteq S_D = S_4$ be the group of motions
of the pyramid, where a motion is identified with the permutation of the face labels
that it induces. Then $G$ acts on $C^D$ on the right as discussed previously, and we
claim that the unlabeled colorings can be identified with the orbits of $G$ in the set
$C^D$ of labeled colorings. Indeed, if $\lambda$ and $\mu$ are labeled colorings in $C^D$, then


$\lambda$ and $\mu$ lead to indistinguishable colorings when the labels are removed
$\Leftrightarrow$ $\lambda$ is achieved by first moving the pyramid and then applying $\mu$
$\Leftrightarrow$ $\lambda = \mu\sigma$ for some $\sigma \in G$
$\Leftrightarrow$ $\lambda$ and $\mu$ are in the same $G$-orbit.


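Hence, the number of unlabeled colorings is equal to the number of orbits, so the
Cauchy-Frobenius lemma applies. In this case $G = \{\varepsilon, \sigma, \sigma^2\}$, where $\sigma = (2\ 3\ 4)$.
We have $F(\varepsilon) = C^D$, so $|F(\varepsilon)| = q^4$. Next,


$F(\sigma) = \{\lambda \mid \lambda\sigma = \lambda\} = \{\lambda \mid \lambda(2) = \lambda(3) = \lambda(4)\}.$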


98 This exponential notation is used because $|C^D| = |C|^{|D|}$.




Hence a coloring $\lambda$ is in $F(\sigma)$ just when sides 2, 3, and 4 are all the same color.
We may choose this color in $q$ ways and color the base in $q$ ways, so $|F(\sigma)| = q^2$.
Similarly, $|F(\sigma^2)| = q^2$, so the number of orbits is $\frac{1}{3}(q^4 + 2q^2)$ by Theorem 1. □
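
The count in Example 1 can be compared with a direct enumeration of orbits. The following sketch makes some labeling assumptions that are not in the text: faces are numbered 0 to 3 with face 0 as the base (face 1 of the example), so $\sigma = (2\ 3\ 4)$ becomes the tuple `(0, 2, 3, 1)`; the function names are illustrative.

```python
from itertools import product

# G = {e, sigma, sigma^2}; the tuple s stores s[i] = image of face i.
PYRAMID_GROUP = [(0, 1, 2, 3), (0, 2, 3, 1), (0, 3, 1, 2)]

def count_orbits(q, group, n):
    """Count orbits of the right action lam . s = lam o s on maps {0..n-1} -> {0..q-1}."""
    seen = set()
    orbits = 0
    for lam in product(range(q), repeat=n):
        if lam in seen:
            continue
        orbits += 1
        for s in group:                      # mark the whole orbit of lam
            seen.add(tuple(lam[s[i]] for i in range(n)))
    return orbits

def pyramid_formula(q):
    return (q**4 + 2 * q**2) // 3

for q in range(1, 5):
    print(q, count_orbits(q, PYRAMID_GROUP, 4), pyramid_formula(q))
```

The brute-force column and the formula column agree for each small $q$, as the Cauchy-Frobenius lemma predicts.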


The technique used in Example 1 can be used in the same way to count the
number of ways to color the edges or vertices of a figure. In general, we label the
objects to be colored as $1, 2, 3, \ldots, n$. The group $G$ is the subgroup of $S_n$ consisting
of all permutations of these objects resulting from a rigid motion of the figure. We
then identify the colorings with the set $C^D$ of all mappings from $D = \{1, 2, \ldots, n\}$
to the set $C$ of colors. If $\lambda$ is such a map and $\sigma \in G$, the map $\lambda\sigma$ colors object $i$
the same as the map $\lambda$ colors object $\sigma(i)$. As $\sigma$ is a motion of the figure, the results
are indistinguishable when the labels are removed, so the number of distinguishable
unlabeled colorings equals the number of orbits (as in Example 1). Hence, the
Cauchy-Frobenius lemma applies.

Before giving more examples, we describe a convenient way to compute $|F(\sigma)|$
in the Cauchy-Frobenius lemma, where $\sigma \in S_n$, $D = \{1, 2, \ldots, n\}$, and $S_n$ acts on
$C^D$, as before. If $\sigma$ is factored into disjoint cycles, we customarily ignore a cycle
$(k)$ of length 1 because $\sigma$ fixes $k$. However, our present purpose requires that we
include such cycles.

For example, if $n = 7$, we now think of $\sigma = (1\ 4)(3\ 5\ 7)$ in $S_7$ as a product
of four disjoint cycles: $\sigma = (1\ 4)(2)(3\ 5\ 7)(6)$. If $q$ colors are available in $C$, we
claim that $|F(\sigma)| = q^4$. Indeed, given $\lambda : D \to C$, we have


$\lambda \in F(\sigma) \Leftrightarrow \lambda\sigma = \lambda \Leftrightarrow \lambda(1) = \lambda(4)$ and $\lambda(3) = \lambda(5) = \lambda(7)$,


so there are $q$ choices for each of the colors $\lambda(1) = \lambda(4)$, $\lambda(2)$, $\lambda(3) = \lambda(5) = \lambda(7)$,
and $\lambda(6)$ and hence $q^4$ possibilities for the map $\lambda$.

The obvious generalization is valid. If $\sigma \in S_n$, then $|F(\sigma)| = q^c$, where $c$ is the
number of cycles in the factorization in $S_n$ of $\sigma$ into disjoint cycles (including cycles
of length 1). The integer $c$ is called the cycle index of $\sigma$ and is denoted $c = \operatorname{cyc} \sigma$.
We record this as Theorem 2.


Theorem 2. Let $C$ be a set of $q$ colors and let $S_n$ act on $C^D$ by composition of
maps, where $D = \{1, 2, \ldots, n\}$. Then

$|F(\sigma)| = q^{\operatorname{cyc} \sigma}$ for any $\sigma \in S_n$.

If $G$ is a subgroup of $S_n$, the number of orbits of $G$ in $C^D$ is

$\frac{1}{|G|} \sum_{\sigma \in G} q^{\operatorname{cyc} \sigma}.$
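
Theorem 2 reduces orbit counting to counting cycles, which is easy to mechanize. The sketch below is an illustration with assumed conventions (objects are numbered from 0, a permutation is the tuple of its images, and `cycle_count` and `orbit_count` are names chosen here). It recomputes $\operatorname{cyc} \sigma$ for $\sigma = (1\ 4)(3\ 5\ 7)$ in $S_7$ and evaluates the orbit formula for $G = S_3$.

```python
from itertools import permutations

def cycle_count(perm):
    """Number of disjoint cycles of perm, counting fixed points as 1-cycles."""
    seen = [False] * len(perm)
    count = 0
    for start in range(len(perm)):
        if not seen[start]:
            count += 1
            j = start
            while not seen[j]:       # walk around the cycle through start
                seen[j] = True
                j = perm[j]
    return count

def orbit_count(group, q):
    """(1/|G|) * sum of q^cyc(sigma) over sigma in G, as in Theorem 2."""
    total = sum(q ** cycle_count(s) for s in group)
    assert total % len(group) == 0
    return total // len(group)

# sigma = (1 4)(3 5 7) in S_7, shifted to labels 0..6.
sigma = (3, 1, 4, 0, 6, 5, 2)
print(cycle_count(sigma))            # cycles (0 3)(1)(2 4 6)(5)

S3 = list(permutations(range(3)))
print([orbit_count(S3, q) for q in range(1, 5)])
```

For $G = S_3$ acting on three objects the values agree with $\frac{1}{6}q(q+1)(q+2)$.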


Example 2. Suppose that a chemical molecule is modeled in the form of an equilateral
triangle with the atoms at the vertices as shown in the figure below. If $q$
colors are available and each atom is painted a single color, how many distinct ways
can the molecule be colored? (The edges are not painted.)


Solution. Here the three vertices, labeled 1, 2, and 3, are permuted by motions in $S_3$.
Because of the high degree of symmetry of the equilateral triangle, every permutation
in $S_3$ can be achieved by a motion, so $S_3$ is the group of motions. By Theorem 2 we get

$|F(\varepsilon)| = q^3,$
$|F(1\ 2\ 3)| = |F(1\ 3\ 2)| = q,$
$|F(1\ 2)| = |F(1\ 3)| = |F(2\ 3)| = q^2.$

[Figure: equilateral triangle with vertices labeled 1, 2, 3.]

By Theorem 1 there are $\frac{1}{6}(q^3 + 2q + 3q^2) = \frac{1}{6}q(q + 1)(q + 2)$ colorings (orbits). □


In Example 3, we vary the theme by insisting that no color is repeated. This 
amounts to labeling the various facets of the object with distinct colors. 


Example 3. Suppose that children's blocks are to be constructed as cubes with
each of the six faces painted a different color. If $q \geq 6$ colors are available, how
many distinct blocks can be made?


Solution. Let $D = \{1, 2, 3, 4, 5, 6\}$ and let $C$ be the set of $q$ colors, as before.
Because the faces are distinct colors, a coloring in this case is a one-to-one mapping
$\lambda : D \to C$. Let $X \subseteq C^D$ denote the set of all such mappings. If $G$ is the group
of motions of the cube, $G$ acts on $X$ by composition because $\lambda\sigma$ is one-to-one
whenever $\sigma \in G$ and $\lambda \in X$. If $\sigma \neq \varepsilon$ in $G$, then $F(\sigma) = \{\lambda \mid \lambda\sigma = \lambda\}$ is empty.
(If $\lambda \in F(\sigma)$, then $\sigma(i) = j$ implies that $\lambda(j) = \lambda[\sigma(i)] = \lambda(i)$, from which $i = j$.)
Thus $|F(\sigma)| = 0$ if $\sigma \neq \varepsilon$, whereas $|F(\varepsilon)| = q!/[(q - 6)!]$ because $F(\varepsilon) = X$. Hence,
the Cauchy-Frobenius lemma gives the number of colorings as $q!/[|G|(q - 6)!]$, so
it remains only to compute $|G|$. Label the faces of the cube 1, 2, 3, 4, 5, and 6. If
we initially place the cube with side 1 on top, we determine a motion by choosing
which side ends up on top (six choices) and then choosing one of four rotations
fixing the top and bottom faces (four choices). Thus, there are $6 \cdot 4 = 24$ choices in
all, so $|G| = 24$ and there are $q!/[24(q - 6)!]$ possible blocks. If $q = 6$ (the minimal
number of colors), there are $6!/24 = 30$ possible blocks. □


The argument in Example 3 gives the general result in Theorem 3. 


Theorem 3. Let $D = \{1, 2, \ldots, n\}$, let $C$ be a set of $q \geq n$ colors, and let $X \subseteq C^D$
denote the set of one-to-one mappings $D \to C$. If $G$ is a subgroup of $S_n$, then $G$
acts on $X$ by composition of maps, and the number of orbits is $q!/[|G|(q - n)!]$.
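
Theorem 3 can be tested on the cube of Example 3. The sketch below is illustrative and rests on assumptions that are not in the text: faces are numbered 0 (top), 1 (bottom), 2 (front), 3 (back), 4 (left), 5 (right); the rotation group is generated from two quarter turns, `SPIN` about the vertical axis and `TIP` about the left-right axis; all names are chosen here.

```python
from itertools import permutations
from math import factorial

def compose(s, t):
    """Composite permutation i -> s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

def generate(generators):
    """Closure of the generators under composition (a finite group)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

SPIN = (0, 1, 5, 4, 2, 3)   # front -> right -> back -> left -> front
TIP = (2, 3, 1, 0, 4, 5)    # top -> front -> bottom -> back -> top
CUBE_GROUP = generate([SPIN, TIP])

def count_injective_orbits(q, group, n):
    """Orbits of lam . s = lam o s on one-to-one maps {0..n-1} -> {0..q-1}."""
    seen, orbits = set(), 0
    for lam in permutations(range(q), n):
        if lam in seen:
            continue
        orbits += 1
        for s in group:
            seen.add(tuple(lam[s[i]] for i in range(n)))
    return orbits

def theorem3_formula(q, group_order, n):
    return factorial(q) // (group_order * factorial(q - n))

print(len(CUBE_GROUP))
print(count_injective_orbits(6, CUBE_GROUP, 6), theorem3_formula(6, 24, 6))
```

Generating the group by closure also confirms the count $|G| = 24$ obtained in Example 3 by placing the cube.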


Needless to say, this theory has been developed further, and these examples 
provide only a glimpse of the possibilities. For example, we could ask how many 
ways a cube can be painted with g colors when exactly two faces are red or when 
at least two faces are red. In 1937, George Pólya answered such questions, and
many others, by giving an elegant and comprehensive generalization of the
Cauchy-Frobenius lemma.⁹⁹ This is beyond the scope of this book.


Exercises 8.6 


1. If H is a subgroup of a finite group G, use the Cauchy—Frobenius lemma to compute 
the number of distinct right cosets of H in G. 

2. Verify the Corollary to the Cauchy—Frobenius lemma when G = S3. 

3. (a) If $q$ colors are available, show that there are $\frac{1}{2}q^2(q + 1)$ ways to paint the vertices
of an isosceles triangle (not equilateral).
(b) Derive the formula in (a) by using elementary counting methods.


99 For an exposition of Pólya's theory (by N. G. de Bruijn), see Beckenbach, E., ed., Applied
Combinatorial Mathematics, New York: Wiley, 1964. Another good treatment appears in Roberts,
F. S., Applied Combinatorics, Englewood Cliffs, NJ: Prentice-Hall, 1984, Chapter 7.


4. (a) If $q$ colors are available, show that there are $\frac{1}{12}q^2(q^2 + 11)$ ways to paint the faces
of a tetrahedron (four faces, each an equilateral triangle). [Hint: Example 3 §2.7.]
(b) Repeat (a) if $q \geq 4$ and no two faces are the same color.

5. (a) If $q$ colors are available, show that there are $\frac{1}{8}q^3(q + 1)(q^2 - q + 4)$ ways to paint
the faces of a rectangular solid with square ends (not a cube).
(b) Repeat (a) if $q \geq 6$ and no two faces are the same color.

6. If $q$ colors are available, how many ways can
(a) the vertices of a tetrahedron be painted?
(b) the edges of a tetrahedron be painted?

7. How many ways can the faces of a cube be painted with $q$ colors? [Hint: The group
$G$ of motions has $|G| = 24$. Here $G$ consists of $\varepsilon$ and various rotations: nine about
a line through the centers of opposite faces, six about a line through the centers of
opposite edges, and eight about a line through opposite vertices.]

8. (a) A circular disk is divided into six equal sections, as shown in the figure at the
right. If $q$ colors are available, how many ways can one side of the disk be painted if
each section is painted a single color? How many if no two sections are the same color?
(b) Repeat (a) if the sections are made of transparent glass and the circle can be turned
over.

9. Show that there are $\frac{1}{2}[q^n + q^{\lfloor (n+1)/2 \rfloor}]$ ways to make a rectangular necktie with $n$
stripes if there are $q$ colors. (Here $\lfloor k \rfloor$ denotes the greatest integer $\leq k$.)

10. Assume that $q$ colors are available for painting the vertices and $r$ colors are available
for painting the edges of an equilateral triangle. Show that there are $\frac{1}{6}qr(qr + 1)(qr + 2)$
ways to paint both edges and vertices. [Hint: A motion $\sigma$ of the triangle
induces a permutation $\sigma_v$ of the vertices and a permutation $\sigma_e$ of the edges. Let $\sigma$
act in the obvious way on pairs $(\lambda, \mu)$, where $\lambda$ and $\mu$ are vertex and edge colorings,
respectively.]

11. Repeat Exercise 10 with a planar figure as shown at the right, where the four outer
edges have the same length and the inner edge is shorter.

12. If $G$ is a finite group, let $p(G)$ denote the probability that $ab = ba$, where $a$ and $b$
are selected at random (with replacement) from $G$.
(a) Show that $p(G) = [k(G)]/|G|$, where $k(G)$ is the number of distinct conjugacy
classes of $G$.
(b) Show that $p(G) \leq \frac{5}{8}$ if $G$ is nonabelian, with equality for a suitable group $G$ of
order 8.


Chapter 9 


Series of Subgroups 


In the future, as in the past, the great ideas must be simplifying ideas. 
—André Weil 


If G is a finite abelian group, it can be shown that G is isomorphic to a direct 
product of cyclic groups (see Chapter 7). This result is an example of a structure 
theorem, that is, a theorem showing that every group in a suitably defined class may
be constructed in a systematic way from well understood groups in the class. Such 
theorems are hard to come by, and the result for finite abelian groups is a stunning 
example. The structure of nonabelian finite groups is much more complicated. 

Suppose that groups $K$ and $H$ are given. It is a very difficult problem to describe
all groups $G$ that have a normal subgroup $K_0$ isomorphic to $K$ such that $G/K_0$ is
isomorphic to $H$. If we could solve this extension problem, the solution would give an
inductive method for constructing all finite groups. Direct and semidirect products 
solve this problem in very special cases. Although the general problem is far from 
being solved, the classes of groups that can be built up this way are of interest. 

To illustrate, suppose that we use only abelian groups as building blocks. Starting
with an abelian group $G_0$, we construct $G_1 \supseteq G_0$ such that $G_0 \lhd G_1$ and $G_1/G_0$
is abelian. Next, we extend $G_1$ to obtain $G_2 \supseteq G_1$ such that $G_1 \lhd G_2$ and $G_2/G_1$
is abelian. After $n$ steps, we have a chain


$G = G_n \supseteq G_{n-1} \supseteq \cdots \supseteq G_1 \supseteq G_0$


where $G_i \lhd G_{i+1}$ and $G_{i+1}/G_i$ is abelian for each $i$. Such a group $G$ is called
solvable, and the theory of these groups is successful in the following sense: The 
class of solvable groups is large (it contains all finite groups of odd order), but at 
the same time, many theorems are true for all solvable groups but do not hold in 
general. We investigate solvable groups in Section 9.2. 




If we use simple groups as building blocks in this way, the resulting groups are
those studied in Section 9.1. In this case, the above chain of subgroups is called
a composition series for $G$, and the famous Jordan-Hölder theorem asserts that $G$
uniquely determines the series of groups $G_n/G_{n-1}, G_{n-1}/G_{n-2}, \ldots, G_1/G_0, G_0$.
This leads to the useful notion of the composition length of a group.

Section 9.3 deals with the somewhat more specialized central series and begins
the study of finite nilpotent groups. These groups are characterized as the groups
that are the direct product of their Sylow subgroups or, equivalently, as the groups
in which every Sylow subgroup is normal. In addition, the Frattini subgroup is
defined for every finite group and shown to be nilpotent.


9.1 THE JORDAN-HÖLDER THEOREM


Much of what we do in this chapter is concerned with groups $G$ that admit a chain
of subgroups with certain nice properties. A subnormal series for $G$ is a chain


$G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots \supseteq G_n = \{1\}$


of subgroups of $G$ such that $G_{i+1} \lhd G_i$ for each $i$. The factor groups $G_i/G_{i+1}$ are
called the factors of the subnormal series. Note that we do not insist that the
subgroups $G_i$ are normal in $G$. Moreover, by possibly deleting some of the groups
$G_i$, we clearly may assume that $G_i \neq G_{i+1}$ for each $i$.


Example 1. $G \supseteq \{1\}$ is a subnormal series for any group $G$. The only factor is $G$.


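Example 2. $A_4 \supseteq K \supseteq H \supseteq \{\varepsilon\}$ is a subnormal series for $A_4$, where
$K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$ and $H = \{\varepsilon, (1\ 2)(3\ 4)\}$. The
factors are $C_3$, $C_2$, and $C_2$ in order. Note that $H$ is not normal in $A_4$.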


The most interesting cases are the groups that admit a subnormal series in 
which every factor is abelian or every factor is simple. We investigate the first case 
in Section 9.2; the second calls for another definition. 

If $G$ is a group, a subnormal series $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ is called
a composition series for $G$ if each factor $G_i/G_{i+1}$ is simple. In this case, the
factors $G_i/G_{i+1}$ are called composition factors of $G$, and the integer $n$ is called
the length of the composition series. If $G = \{1\}$, we say that $G$ has a composition
series of length 0.


Example 3. The simple groups are those with a composition series of length 1. 


Example 4. Every finite group $G$ has a composition series. This holds by definition
if $G = \{1\}$. If $G \neq \{1\}$, write $G = G_0$ and choose a maximal normal subgroup $G_1$
of $G_0$ (it exists because $G$ is finite). Then $G_0/G_1$ is simple by Theorem 6 §8.1. If
$G_1 \neq \{1\}$, choose a maximal normal subgroup $G_2$ of $G_1$ and continue in this way.
The series $G = G_0 \supseteq G_1 \supseteq G_2 \supseteq \cdots$ must reach $\{1\}$ eventually because $G$ is finite,
so it is a composition series.


The converse of Example 4 is false: Any infinite simple group has a composition 
series of length 1. However, the converse does hold for abelian groups. 


Example 5. An abelian group $G$ has a composition series if and only if $G$ is finite.
Indeed, if $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ is a composition series, each composition
factor $G_i/G_{i+1}$ is a simple abelian group and so is finite. Hence (Exercise 11),
$|G| = |G_0/G_1|\,|G_1/G_2| \cdots |G_{n-1}/G_n|$ is also finite.


The finite abelian groups are not the only ones having all composition factors 
abelian. Theorem 8 §8.2 shows that every finite p-group has this property: 


Example 6. If $p$ is a prime, each finite $p$-group has a composition series in which
every composition factor is isomorphic to $C_p$.


A group may have several different composition series. For example, let $G$ be a
cyclic group of order 12 and, for each divisor $d$ of 12, let $H_d$ be the unique subgroup
of $G$ of order $d$. Then $G$ has three composition series. These series, along with their
composition factors, are


$G \supseteq H_6 \supseteq H_3 \supseteq \{1\}$, Factors: $C_2, C_2, C_3$,
$G \supseteq H_6 \supseteq H_2 \supseteq \{1\}$, Factors: $C_2, C_3, C_2$,
$G \supseteq H_4 \supseteq H_2 \supseteq \{1\}$, Factors: $C_3, C_2, C_2$.
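
For a cyclic group the composition series can be listed mechanically, because $C_n$ has exactly one subgroup of each order dividing $n$ and a factor $H_d/H_e$ is simple exactly when $d/e$ is prime. The sketch below is an illustration only (the function names are chosen here); it represents a series by the list of subgroup orders.

```python
def prime_factors(n):
    """Distinct prime divisors of n, in increasing order."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def composition_series(n):
    """All chains n = d_0 > d_1 > ... > 1 of subgroup orders in C_n
    in which every index d_i / d_(i+1) is prime."""
    if n == 1:
        return [[1]]
    series = []
    for p in prime_factors(n):
        for tail in composition_series(n // p):
            series.append([n] + tail)
    return series

def factor_orders(chain):
    """Orders of the composition factors, read from the top down."""
    return [chain[i] // chain[i + 1] for i in range(len(chain) - 1)]

for chain in composition_series(12):
    print(chain, factor_orders(chain))
```

For $n = 12$ this prints the three chains of orders $12, 6, 3, 1$; $12, 6, 2, 1$; and $12, 4, 2, 1$, with the same factor orders listed above.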


Note that the length is 3 in each case and the factors are also the same except 
for the order in which they appear. Hence, these series are all equivalent in the 
following sense: Two composition series for a group are said to be equivalent if 
they have the same length and the composition factors can be paired in such a way 
that corresponding factors are isomorphic. This is clearly an equivalence relation 
on the set of all composition series of a group G (assuming that there is one). The 
remarkable thing is that any two composition series for G are equivalent. This is 
the most important theorem in this section. 


Theorem 1. Jordan-Hölder Theorem. If a group has a composition series, any
two composition series are equivalent.


We will give the proof at the end of this section. 

If a group $G$ has a composition series, it uniquely determines the length of
the series and the composition factors (including multiplicities). Hence, we can
speak of the composition length of the group $G$, denoted length $G$, and of the
composition factors of $G$.

Composition series were first discussed by Camille Jordan. In 1869, he showed 
(for groups of permutations) that the orders of the composition factors are the same 
for every composition series of the group. However, it was not until 20 years later, 
after the abstract definition of a group had been given, that Otto Hölder observed
that the group uniquely determines the composition factors themselves and that 
they are the same in any composition series. 


Example 7. If $n \geq 5$, then $S_n \supseteq A_n \supseteq \{\varepsilon\}$ is a composition series because $A_n$ is
simple (Theorem 8 §2.8). Hence, $S_n$ has length 2 and the composition factors are
$C_2 \cong S_n/A_n$ and $A_n$. If $n = 4$, we get a composition series

$S_4 \supseteq A_4 \supseteq K \supseteq H \supseteq \{\varepsilon\},$


where $K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$ and $H = \{\varepsilon, (1\ 2)(3\ 4)\}$.
Hence, $S_4$ has length 4 and the composition factors are $C_2$, $C_3$, $C_2$, and $C_2$.


The Jordan-Hölder theorem is a type of unique factorization theorem. In the
following corollary, we use it to give another proof that the factorization of an integer
$n$ into primes is unique. Here, the composition factors play the role of primes.




Corollary. The factorization of an integer $n \geq 2$ into primes is unique.


Proof. Let $n = p_1 p_2 \cdots p_r$, where the $p_i$ are (not necessarily distinct) primes. If
$G = \langle g \rangle$ is a cyclic group of order $n$, then


$G = \langle g \rangle \supseteq \langle g^{p_1} \rangle \supseteq \langle g^{p_1 p_2} \rangle \supseteq \cdots \supseteq \langle g^{p_1 p_2 \cdots p_{r-1}} \rangle \supseteq \{1\}$


is a composition series for $G$ because the factors are $C_{p_1}, C_{p_2}, \ldots, C_{p_r}$. (Indeed,
$o(g) = n$ and $o(g^{p_1 p_2 \cdots p_k}) = p_{k+1} \cdots p_r$ for each $k$ by Theorem 5 §2.4.) Since any
other factorization of $n$ into primes must yield the same composition factors, it
follows that $n$ uniquely determines the number $r = \operatorname{length} G$ and the primes $p_i$. ∎


If $K \lhd G$ and $G/K \cong H$, the group $G$ is called an extension of $K$ by $H$. So if

$G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$


is a composition series for $G$, then each $G_i$ is an extension of $G_{i+1}$ by a simple group.
Thus, each finite group $G$ is the result of a finite number of extensions by finite
simple groups, and the Jordan-Hölder theorem shows that $G$ uniquely determines
the simple groups used (up to order). Moreover, we know all the finite simple groups
(see the discussion at the end of Section 2.8), so the complete description of all finite
groups comes down to the extension problem: For a simple group $H$, describe
all extensions of a given group $G$ by $H$. This is a very difficult task.

We are going to prove that subgroups and homomorphic images of groups with
a composition series again have composition series. The proof requires the following
technical lemma, which gives important information about subnormal series in
general and will be referred to again later. For composition series, we use
it to deduce some important properties of the length of a group and to prove the
Jordan-Hölder theorem itself.


Lemma 1. Let $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ be a subnormal series for the
group $G$ and let $K \lhd G$.

(1) $K = K \cap G_0 \supseteq K \cap G_1 \supseteq \cdots \supseteq K \cap G_n = \{1\}$ is a subnormal series for $K$,
and the factor $(K \cap G_i)/(K \cap G_{i+1})$ is isomorphic to a normal subgroup of
$G_i/G_{i+1}$ for each $i$.

(2) $G/K = (KG_0)/K \supseteq (KG_1)/K \supseteq \cdots \supseteq (KG_n)/K = \{K\}$ is a subnormal
series for $G/K$, and the factor $[(KG_i)/K]/[(KG_{i+1})/K]$ is a homomorphic
image of $G_i/G_{i+1}$ for each $i$.


Proof. We leave to the reader the verification that the series are subnormal.

(1) Define $\alpha : K \cap G_i \to G_i/G_{i+1}$ by $\alpha(x) = xG_{i+1}$. This is clearly a group
homomorphism and $\ker \alpha = \{x \in K \cap G_i \mid x \in G_{i+1}\} = K \cap G_{i+1}$. Hence, it remains
to prove that $\alpha(K \cap G_i) \lhd (G_i/G_{i+1})$. But if $x \in K \cap G_i$ and $y \in G_i$, then


$(yG_{i+1})\,\alpha(x)\,(yG_{i+1})^{-1} = (yxy^{-1})G_{i+1} = \alpha(yxy^{-1}) \in \alpha(K \cap G_i)$


because $yxy^{-1} \in (yKy^{-1}) \cap G_i = K \cap G_i$.

(2) Since $KG_{i+1} \lhd KG_i$ (Exercise 15 §8.1), the third isomorphism theorem
(Theorem 7 §8.1) shows that

$\frac{(KG_i)/K}{(KG_{i+1})/K} \cong \frac{KG_i}{KG_{i+1}}.$

To show that $(KG_i)/(KG_{i+1})$ is an image of $G_i/G_{i+1}$, define

$\alpha : \frac{G_i}{G_{i+1}} \to \frac{KG_i}{KG_{i+1}}$ by $\alpha(xG_{i+1}) = xKG_{i+1}$ for all $x \in G_i$.


This is well defined because $xG_{i+1} = yG_{i+1}$ implies that $y^{-1}x \in G_{i+1} \subseteq KG_{i+1}$. It
is clearly a group homomorphism, and (as $K \lhd G$) it is onto because


$kx\,KG_{i+1} = x(x^{-1}kx)\,KG_{i+1} = xKG_{i+1} = \alpha(xG_{i+1})$

holds for all $k \in K$ and $x \in G_i$. ∎


Now suppose that $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ is a composition series for $G$.
If $K \lhd G$, consider the subnormal series for $K$ and $G/K$ given in Lemma 1.
In both cases, the factors are isomorphic to either normal subgroups
or images of the simple groups $G_i/G_{i+1}$ and so are all either simple or $\{1\}$. Hence,
after we eliminate equalities, these series become composition series for $K$ and $G/K$,
respectively, each with factors from $G$. This proves part of Theorem 2.


Theorem 2. Let $G$ be a group and let $K \lhd G$. Then $G$ has a composition series if
and only if both $K$ and $G/K$ have composition series. Moreover, in this case,


(1) length G = length K + lengthG/K. 
(2) The composition factors of G are exactly those of K and G/K. 
(3) G has a composition series containing K. 


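Proof. If $G$ has a composition series, we have already seen that this is true of $K$
and $G/K$. Conversely, suppose


$K = K_0 \supseteq K_1 \supseteq \cdots \supseteq K_m = \{1\}$ and $\frac{G}{K} = \frac{G_0}{K} \supseteq \frac{G_1}{K} \supseteq \cdots \supseteq \frac{G_n}{K} = \{K\}$

are composition series for $K$ and $G/K$. Because $\frac{G_i}{G_{i+1}} \cong \frac{G_i/K}{G_{i+1}/K}$, the series


$G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = K = K_0 \supseteq K_1 \supseteq \cdots \supseteq K_m = \{1\}$


is a composition series for $G$. Now (1)-(3) are apparent. ∎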


Corollary 1. If $\theta : G \to H$ is a homomorphism, then $G$ has a composition series if
and only if both $\ker \theta$ and $\theta(G)$ have composition series. In this case,


length $G$ = length $\ker \theta$ + length $\theta(G)$.


Proof. Since $G/\ker \theta \cong \theta(G)$, Theorem 2 applies with $K = \ker \theta$. ∎


Corollary 2. If $G_1, G_2, \ldots, G_n$ are groups, then $G_1 \times G_2 \times \cdots \times G_n$ has a
composition series if and only if the same is true of each $G_i$. In this case,
\[ \text{length}(G_1 \times G_2 \times \cdots \times G_n) = \text{length } G_1 + \text{length } G_2 + \cdots + \text{length } G_n. \]

Solution. We prove it for $n = 2$, and then the general case follows by induction.
Define $\theta : G_1 \times G_2 \to G_2$ by $\theta(g_1, g_2) = g_2$ for all $g_1 \in G_1$ and $g_2 \in G_2$. Then $\theta$ is a
homomorphism, $\ker \theta = G_1 \times \{1\} \cong G_1$, and $\theta(G_1 \times G_2) = G_2$. Use Corollary 1. □


Example 8. Let $G$ be an abelian group of order $p_1^{n_1}p_2^{n_2} \cdots p_r^{n_r}$, where the $p_i$ are
distinct primes. Show that length $G = n_1 + n_2 + \cdots + n_r$.




Solution. Proceed by induction on $|G|$. If $|G| = 1$, it is clear. In general, let $K$
be a maximal (normal) subgroup of $G$. Then $|G/K|$ is a prime divisor of $|G|$, say
$|G/K| = p_1$. Hence, $|K| = p_1^{n_1 - 1}p_2^{n_2} \cdots p_r^{n_r}$, so induction and Theorem 2 give
\[ \text{length } G = \text{length } K + \text{length}(G/K) = [(n_1 - 1) + n_2 + \cdots + n_r] + 1 = n_1 + n_2 + \cdots + n_r. \] □


We conclude this section with a proof of the Jordan-Hölder theorem. The proof
requires the following lemma.


Lemma 2. Let $G$ be a group and let $H$ and $K$ be distinct maximal normal
subgroups of $G$. Then $H \cap K$ is maximal normal in both $H$ and $K$. Moreover,
\[ \frac{H}{H \cap K} \cong \frac{G}{K} \quad \text{and} \quad \frac{K}{K \cap H} \cong \frac{G}{H}. \]

Proof. We first claim that $KH = G$. Indeed, $H \subseteq KH \lhd G$ and $K \subseteq KH \lhd G$,
so if $KH \neq G$, the fact that $H$ and $K$ are maximal normal in $G$ implies that
$H = KH = K$, contrary to assumption. Hence, $KH = G$, so the second isomorphism
theorem (Theorem 1 §8.1) gives $\frac{G}{K} = \frac{KH}{K} \cong \frac{H}{K \cap H}$. Because $G/K$ is simple, this
shows that $K \cap H$ is maximal normal in $H$. The rest is proved in the same way. □


Proof of the Jordan-Hölder Theorem. Suppose the group $G$ has a composition series
\[ G = G_0 \supset G_1 \supset G_2 \supset \cdots \supset G_n = \{1\} \tag{1} \]
of length $n$. We show by induction on $n$ that every composition series
\[ G = H_0 \supset H_1 \supset H_2 \supset \cdots \supset H_m = \{1\} \tag{2} \]
for $G$ is equivalent to series (1). If $n = 1$, then $G$ is simple, so $G_1 = \{1\} = H_1$
and the theorem holds. So assume that $n \geq 2$ and that the theorem holds for all
groups with a composition series of length less than $n$. In particular, it holds for $G_1$
because $G_1 \supset G_2 \supset \cdots \supset G_n = \{1\}$ has length $n - 1$. If it happens that $H_1 = G_1$,
then $G_1 \supset H_2 \supset \cdots \supset H_m = \{1\}$ is another composition series for $G_1$ and so is
equivalent to $G_1 \supset G_2 \supset \cdots \supset G_n = \{1\}$ by induction, and the theorem follows.

So assume that $H_1 \neq G_1$ and let $H_1 \cap G_1 = L_0 \supset L_1 \supset \cdots \supset L_s = \{1\}$ be a
composition series for $H_1 \cap G_1$, which exists by Theorem 2 as $H_1 \cap G_1 \lhd G$. Now consider the
following series for $G$:
\[ G \supset G_1 \supset (H_1 \cap G_1) \supset L_1 \supset \cdots \supset L_s = \{1\}, \tag{3} \]
\[ G \supset H_1 \supset (H_1 \cap G_1) \supset L_1 \supset \cdots \supset L_s = \{1\}. \tag{4} \]

As $H_1 \neq G_1$, Lemma 2 asserts that $H_1 \cap G_1$ is maximal normal in each of $H_1$ and $G_1$,
so both (3) and (4) are composition series for $G$. Moreover, $G/G_1 \cong H_1/(H_1 \cap G_1)$
and $G/H_1 \cong G_1/(H_1 \cap G_1)$ by Lemma 2, so (3) and (4) are equivalent; denote this
as (3) ~ (4). Note that this equivalence holds even if $s = 0$, that is, if $H_1 \cap G_1 = \{1\}$.

Now $G_1 \supset G_2 \supset \cdots \supset G_n = \{1\}$ and $G_1 \supset (H_1 \cap G_1) \supset L_1 \supset \cdots \supset L_s = \{1\}$
are composition series for $G_1$ and so are equivalent by induction. This implies
that (1) ~ (3) and also that $n - 1 = s + 1$. But then the composition series
$H_1 \supset (H_1 \cap G_1) \supset L_1 \supset \cdots \supset L_s = \{1\}$ has length $n - 1$ and so (again by
induction) is equivalent to $H_1 \supset H_2 \supset \cdots \supset H_m = \{1\}$. This in turn implies that
(2) ~ (4). Piecing these equivalent series together gives (1) ~ (3) ~ (4) ~ (2), which
proves the Jordan-Hölder theorem. □
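
As a small illustration of the theorem (a sketch added for this edition of the text, not part of the original proof), the Python code below lists every composition series of the cyclic group $C_n$ and compares their factors. It assumes the standard fact that $C_n$ has exactly one subgroup of each order dividing $n$, so a series is recorded simply as the list of subgroup orders. The names `prime_factors`, `composition_series`, and `factor_orders` are hypothetical helpers introduced here.

```python
def prime_factors(n):
    """Distinct prime divisors of n, in increasing order."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def composition_series(n):
    """All composition series of C_n, each given as subgroup orders n = d_0 > d_1 > ... > 1.

    A maximal subgroup of a cyclic group of order d has order d / p for a prime p
    dividing d, so each step divides the current order by one prime.
    """
    if n == 1:
        return [[1]]
    series = []
    for p in prime_factors(n):
        for tail in composition_series(n // p):
            series.append([n] + tail)
    return series


def factor_orders(orders):
    """Sorted orders of the (cyclic, prime-order) factors of one series."""
    return sorted(a // b for a, b in zip(orders, orders[1:]))


if __name__ == "__main__":
    for s in composition_series(12):
        print(s, factor_orders(s))
```

For $n = 12$ the sketch finds three series, and each has the same factors $C_2, C_2, C_3$ up to order, as the Jordan-Hölder theorem requires.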




Exercises 9.1 


1. In each case, find the length of the group and exhibit the composition factors.
(a) Cg (b) Che (c) Da (d) Ag (e) Q (see Section 2.8) 

2. If $n \geq 1$ and $p$ is prime, show that $C_{p^n}$ has exactly one composition series.

3. Find all composition series for

(a) Coa 

(b) Co 

4. Find two nonisomorphic finite groups with identical composition factors.

5. Find all composition series for Cy x Co.

6. If $n = p_1^{n_1}p_2^{n_2} \cdots p_r^{n_r}$, find the length of $D_n$. [Hint: Example 8.]

7. Find a composition series for $D_{16}$ that contains the center $Z(D_{16})$ and find one that
does not contain $Z(D_{16})$.

8. (a) For each $m \geq 2$, find a group of length $m$.
(b) For each $m \geq 2$, find a group of length 1 with a subgroup of length $m$.

9. Let $G = K_1 \times K_2 \times \cdots \times K_r$, where each $K_i$ is simple. Show that the groups $K_i$ are
the composition factors of $G$.

10. Describe the groups of length 2 by using Exercise 6 §8.1.

11. Let $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ be any subnormal series. If each factor $G_i/G_{i+1}$
is finite, show that $G$ is finite and that $|G| = |G_0/G_1| \cdot |G_1/G_2| \cdots |G_{n-1}/G_n|$.

12. If $p$ is a prime, show that a finite group is a $p$-group if and only if all its composition
factors are isomorphic to $C_p$.

13. Suppose that $G$ is a group with a composition series. Show that any subnormal series
$G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ can be refined (by inserting groups if necessary) to
a composition series for $G$.

14. Suppose that $G$ has a composition series with no two factors isomorphic.
(a) Show that no two proper normal subgroups of $G$ are isomorphic. [Hint: If $H \lhd G$
and $K \lhd G$, find a composition series through $HK$, $H$, and $H \cap K$ and one through
$HK$, $K$, and $H \cap K$. Use Exercise 13.]
(b) Show that every normal subgroup of $G$ is characteristic in $G$.

15. Let $n = p_1^{n_1}p_2^{n_2} \cdots p_r^{n_r}$, where the $p_i$ are distinct primes and each $n_i \geq 1$.
(a) Show that $C_n$ has exactly $r$ maximal normal subgroups.
(b) If $m = n_1 + n_2 + \cdots + n_r$, show that $C_n$ has $\dfrac{m!}{n_1!\,n_2! \cdots n_r!}$ composition series.
[Hint: Induct on $m$.]

16. Prove the Zassenhaus lemma:¹⁰⁰ Let $H_1 \lhd H$ and $K_1 \lhd K$ be subgroups of a
group $G$. Then $H_1(H \cap K_1) \lhd H_1(H \cap K)$, $K_1(H_1 \cap K) \lhd K_1(H \cap K)$, and we
have $\dfrac{H_1(H \cap K)}{H_1(H \cap K_1)} \cong \dfrac{K_1(H \cap K)}{K_1(H_1 \cap K)}$. [Hint: Each group is isomorphic to $\dfrac{H \cap K}{(H_1 \cap K)(H \cap K_1)}$ by
Theorem 4 §2.10.]

17. Prove the Schreier refinement theorem:¹⁰¹ Two subnormal series $G = G_0 \supseteq
G_1 \supseteq \cdots \supseteq G_n = \{1\}$ and $G = H_0 \supseteq H_1 \supseteq \cdots \supseteq H_m = \{1\}$ can be refined (by
inserting groups) in such a way that the resulting series are equivalent. [Hint:
$G_i = G_{i+1}(G_i \cap H_0) \supseteq G_{i+1}(G_i \cap H_1) \supseteq \cdots \supseteq G_{i+1}(G_i \cap H_m) = G_{i+1}$.
Do a similar construction with the $H_j$ and use the Zassenhaus lemma.]

18. Use the Schreier refinement theorem to prove the Jordan-Hölder theorem.


¹⁰⁰ Due to Hans Zassenhaus.
¹⁰¹ Due to Otto Schreier.


9.2 SOLVABLE GROUPS 


In Section 9.1, we were concerned with groups that admit a composition series, that 
is, a subnormal series in which all the factors are simple. Although those groups 
are of interest, we obtain an even more important class of groups when the factors 
are required to be abelian rather than simple. 


A group $G$ is called a solvable group¹⁰² if there exists a subnormal series
\[ G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\} \]
such that each factor $G_i/G_{i+1}$ is abelian. Such a series is called a solvable series
for $G$. Note that $\{1\}$ is solvable.


Example 1. Every abelian group $G$ is solvable because $G \supseteq \{1\}$ is a solvable series.


Example 2. If $p$ is a prime, every finite $p$-group is solvable. In fact, it has a
composition series in which each factor is isomorphic to $C_p$ (Theorem 8 §8.2).


Example 3. $D_n$ is solvable for each $n$. Indeed, $D_n$ has a cyclic subgroup $H$ of
index 2, so $H \lhd D_n$ and $D_n \supseteq H \supseteq \{1\}$ is a solvable series.


Example 4. $S_4$ is solvable because $S_4 \supseteq A_4 \supseteq K \supseteq \{\varepsilon\}$ is a solvable series, where
$K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$.


Example 5. If $p$ and $q$ are primes, show that any group $G$ of order $pq$ is solvable.


Solution. Because $G$ is not simple by the Sylow theorems (Example 5 §8.4), let
$K \neq \{1\}$ be a proper normal subgroup. Then $|K| = p$ or $|K| = q$; either way
$G \supseteq K \supseteq \{1\}$ is a solvable series with factors $C_p$ and $C_q$. □


Suppose that $G_1$ is an abelian group. It is difficult to describe how to construct
all groups $G_2$ such that $G_1 \lhd G_2$ and $G_2/G_1$ is abelian. Nonetheless, suppose
we carry out this construction and then repeat it to construct a group $G_3$ such
that $G_2 \lhd G_3$ and $G_3/G_2$ is abelian. If we continue this procedure, each group
constructed is clearly solvable, and we can obtain every solvable group in this way
(constructed from the bottom up, as it were). Viewing solvable groups in this way
is useful, but an analogous top-down construction is actually more important. It is
based on the derived subgroup introduced in Section 2.9.

Recall that an element of the form $aba^{-1}b^{-1}$ in a group $G$ is called a commutator,
denoted $[a, b]$. The set $G'$ of all products of commutators is a subgroup, called the
derived subgroup of $G$, and has the following properties (Theorem 3 §2.9):


(1) $G' \lhd G$ and $G/G'$ is abelian.
(2) If $K \lhd G$ and $G/K$ is abelian, then $G' \subseteq K$.


For a group $G$, repeatedly taking the derived subgroup leads to a subnormal series
of subgroups $G \supseteq G' \supseteq G'' \supseteq G''' \supseteq \cdots$ in which each factor is abelian. There is a
standard notation for these subgroups. Given a group $G$, construct subgroups $G^{(0)}$,
$G^{(1)}$, $G^{(2)}, \ldots$ of $G$ as follows:


(1) Define $G^{(0)} = G$.
(2) If $G^{(i)}$ has been constructed for $i \geq 0$, define $G^{(i+1)} = [G^{(i)}]'$.


¹⁰² These groups are also called soluble groups.




Thus, $G^{(1)} = G'$, $G^{(2)} = G''$, $G^{(3)} = G'''$, and so on. Furthermore, $G^{(i+1)} \lhd G^{(i)}$ for
each $i$ and the subnormal series
\[ G = G^{(0)} \supseteq G^{(1)} \supseteq G^{(2)} \supseteq \cdots \]
is called the derived series for $G$. Note that $G^{(i)}/G^{(i+1)}$ is abelian for each $i$ by
Theorem 3 §2.9. The groups $G^{(i)}$ are called the higher derived subgroups of $G$,
and they are actually normal in $G$ as the next theorem shows. The proof requires
the following lemma; we leave the proof as Exercise 8.


Lemma 1. Let $G$ denote a group and let $H$ be a subgroup.
(1) If $\alpha : G \to G$ is a homomorphism, then $\alpha(G') \subseteq G'$.
(2) $G' \subseteq H$ if and only if $H \lhd G$ and $G/H$ is abelian.
(3) If $H \subseteq G$ is a subgroup, then $H' \subseteq G'$.

Theorem 1. If $G$ is a group, we have $G^{(i)} \lhd G$ for all $i \geq 0$.


Proof. We use induction on $i \geq 0$. It is clear if $i = 0$, so assume inductively
that $G^{(i)} \lhd G$ for some $i$. If $a \in G$ then $\sigma_a(G^{(i)}) = G^{(i)}$, where $\sigma_a$ is the inner
automorphism of $G$ determined by $a$. But then Lemma 1 gives
\[ \sigma_a[G^{(i+1)}] = \sigma_a[(G^{(i)})'] \subseteq (G^{(i)})' = G^{(i+1)}. \]
This shows that $G^{(i+1)} \lhd G$ and so completes the induction. □

The solvable groups $G$ are just those for which the derived series reaches $\{1\}$.

Theorem 2. A group $G$ is solvable if and only if $G^{(n)} = \{1\}$ for some $n \geq 1$.

Proof. If $G^{(n)} = \{1\}$, then
\[ G = G^{(0)} \supseteq G^{(1)} \supseteq G^{(2)} \supseteq \cdots \supseteq G^{(n)} = \{1\} \]
is a solvable series for $G$ because $G^{(i)}/G^{(i+1)} = G^{(i)}/[G^{(i)}]'$ is abelian for each $i$.
Conversely, let $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ be a solvable series for $G$. It
suffices to show that $G^{(i)} \subseteq G_i$ holds for each $i$. This is clear if $i = 0$, so assume
that $G^{(i)} \subseteq G_i$ for some $i \geq 0$. As $G_i/G_{i+1}$ is abelian, we have $G_i' \subseteq G_{i+1}$. Hence,
\[ G^{(i+1)} = [G^{(i)}]' \subseteq G_i' \subseteq G_{i+1} \]
by Lemma 1, which completes the induction. □

If $G$ is solvable, then the inclusion $G^{(i)} \supset G^{(i+1)}$ is strict whenever $G^{(i)} \neq \{1\}$, by Corollary 1 of Theorem 3.
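
The criterion in Theorem 2 can be checked by brute force for small permutation groups. The following Python sketch is an illustration only, not an efficient algorithm: it stores a permutation of $\{0, 1, \ldots, n-1\}$ as a tuple and computes the orders of the derived subgroups until the series stops shrinking. All function names (`compose`, `inverse`, `generated_subgroup`, `derived_subgroup`, `derived_series_orders`) are hypothetical helpers introduced for this example.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, degree):
    """Subgroup generated by the given permutations.

    In a finite group, closing under products of generators is enough,
    because inverses are positive powers.
    """
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for x in frontier:
            for g in generators:
                y = compose(x, g)
                if y not in elements:
                    elements.add(y)
                    new.append(y)
        frontier = new
    return elements


def derived_subgroup(group, degree):
    """Subgroup generated by all commutators a b a^{-1} b^{-1} with a, b in group."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group
        for b in group
    }
    return generated_subgroup(commutators, degree)


def derived_series_orders(group, degree):
    """Orders |G^(0)|, |G^(1)|, ... up to the point where the series stabilizes."""
    orders = [len(group)]
    while True:
        nxt = derived_subgroup(group, degree)
        if len(nxt) == len(group):  # G^(i+1) = G^(i): nothing further changes
            return orders
        orders.append(len(nxt))
        group = nxt


if __name__ == "__main__":
    for n in (3, 4, 5):
        sym = set(permutations(range(n)))
        print(n, derived_series_orders(sym, n))
```

A group handled this way is solvable exactly when the last order reported is 1. For $S_4$ the orders are 24, 12, 4, 1 (matching the series $S_4 \supseteq A_4 \supseteq K \supseteq \{\varepsilon\}$ of Example 4), whereas for $S_5$ the series stalls at $A_5$ of order 60.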


Theorem 2 provides a quick method of establishing several basic properties of 
solvable groups. We begin with the following result. 


Theorem 3. Every subgroup and image of a solvable group is again solvable.


Proof. Suppose that $G$ is solvable and let $G^{(n)} = \{1\}$. If $H$ is a subgroup of $G$, it
suffices to show that $H^{(i)} \subseteq G^{(i)}$ for each $i$. This follows by induction: It is clear
when $i = 0$, and if $H^{(i)} \subseteq G^{(i)}$, then Lemma 1 gives
\[ H^{(i+1)} = [H^{(i)}]' \subseteq [G^{(i)}]' = G^{(i+1)}. \]
Now let $\alpha : G \to K$ be an onto group homomorphism. It suffices to show that
$K^{(i)} \subseteq \alpha[G^{(i)}]$ for each $i$. This is clear if $i = 0$ because $\alpha$ is onto, so assume that
$K^{(i)} \subseteq \alpha(G^{(i)})$. Then, given $x$ and $y$ in $K^{(i)}$, write $x = \alpha(a)$ and $y = \alpha(b)$, where
$a, b \in G^{(i)}$. Hence,
\[ [x, y] = [\alpha(a), \alpha(b)] = \alpha([a, b]) \in \alpha(G^{(i+1)}), \]
so $K^{(i+1)} \subseteq \alpha(G^{(i+1)})$ because $K^{(i+1)} = (K^{(i)})'$. □

Corollary 1. If $H \neq \{1\}$ is a subgroup of a solvable group $G$, then $H' \neq H$.


Proof. If $H' = H$, then $H^{(2)} = [H']' = H' = H$ and an induction shows that
$H^{(i)} = H \neq \{1\}$ holds for each $i$. As $H$ is solvable by Theorem 3, this result
contradicts Theorem 2. □


Corollary 2. A simple group is solvable if and only if it is abelian (of prime order).


Proof. Let $G$ be solvable. If $G$ is simple, then either $G' = \{1\}$ (so $G$ is abelian) or
$G' = G$ (contradicting Corollary 1). The converse is clear. □


Example 6. If $n \geq 5$, the symmetric group $S_n$ is not solvable. For if $S_n$ were
solvable, $A_n$ would be solvable by Theorem 3 so, as $A_n$ is simple, Corollary 2 would
imply that $A_n$ is abelian, which is not the case. Hence, $S_n$ is not solvable.


Example 6 explains the origin of the term solvable. A classical problem in
the theory of equations was to find a formula for the roots of a real polynomial
$x^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0$ in terms of the coefficients $a_i$. If $n = 2$, the solution
is the famous quadratic formula: $\tfrac{1}{2}\left[-a_1 \pm \sqrt{a_1^2 - 4a_0}\right]$. In general, such a formula
should give the roots in terms of the coefficients $a_i$ using only arithmetic operations
and the extraction of roots. Such formulas were found for $n = 3$ and $n = 4$,
but the case $n = 5$ proved to be difficult. Call a polynomial $f$ solvable if such a
formula exists. It will be shown in Chapter 10 that $f$ is solvable if and only if a
certain group (called the Galois group of $f$) is a solvable group. For example, the
polynomial $x^5 - 6x + 2$ has Galois group $S_5$ (Example 1 §10.3) and so cannot be
solvable. Incidentally, the first proof that a nonsolvable polynomial exists was given
in 1824 by the young Norwegian mathematician Niels Henrik Abel, building on the
work of Paolo Ruffini.

Theorem 4 gives a useful way to show that a group is solvable. 


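Theorem 4. If $K \lhd G$, then $G$ is solvable if and only if both $K$ and $G/K$ are
solvable.


Proof. Assume that $K$ and $G/K$ are solvable and let
\[ K = K_0 \supseteq K_1 \supseteq \cdots \supseteq K_m = \{1\} \quad \text{and} \quad \frac{G}{K} = \frac{G_0}{K} \supseteq \frac{G_1}{K} \supseteq \cdots \supseteq \frac{G_n}{K} = \{K\} \]
be solvable series. Then
\[ G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = K = K_0 \supseteq K_1 \supseteq \cdots \supseteq K_m = \{1\} \]
is a subnormal series for $G$ and the factors are abelian because $\frac{G_i}{G_{i+1}} \cong \frac{G_i/K}{G_{i+1}/K}$ for
each $i$. Hence, $G$ is solvable. The converse follows by Theorem 3. □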


Example 7. If $G_1, G_2, \ldots, G_n$ are groups, then $G_1 \times G_2 \times \cdots \times G_n$ is solvable if
and only if the same is true of each $G_i$.


Solution. By induction, it suffices to prove it for $n = 2$. Let $\theta : G_1 \times G_2 \to G_2$ be
the projection given by $\theta(g_1, g_2) = g_2$. Then $(G_1 \times G_2)/\ker \theta \cong \theta(G_1 \times G_2) = G_2$
and $\ker \theta = G_1 \times \{1\} \cong G_1$ are both solvable, so Theorem 4 applies. □


The above theorems are valid for arbitrary groups. We now give some conditions 
equivalent to solvability in a finite group. 


Theorem 5. The following conditions are equivalent for a finite group $G$.
(1) $G$ is solvable.
(2) The composition factors of $G$ are all abelian.
(3) $H' \neq H$ for every subgroup $H \neq \{1\}$ of $G$.


Proof. Note first that $G$ has a composition series because it is finite.

(1) ⇒ (2). Each composition factor is simple and solvable (Theorem 3), and so
is abelian by Corollary 2 of Theorem 3.

(2) ⇒ (1). Any composition series for $G$ is a solvable series by (2).

(1) ⇒ (3). This is Corollary 1 of Theorem 3.

(3) ⇒ (1). The derived series $G = G^{(0)} \supseteq G^{(1)} \supseteq G^{(2)} \supseteq \cdots$ reaches $\{1\}$ because
$G$ is finite and $G^{(i)} \supset (G^{(i)})' = G^{(i+1)}$ strictly for each $i$ with $G^{(i)} \neq \{1\}$ by (3). □


Example 8. Let $R$ be any ring. If $n \geq 3$, show that the group $G$ of all invertible
$n \times n$ matrices over $R$ is not solvable.


Solution. Let $E_{ij}$ denote the $n \times n$ matrix with $(i, j)$-entry 1 and zeros elsewhere.
Then $E_{ij}E_{jk} = E_{ik}$, whereas $E_{ij}E_{kl} = 0$ if $j \neq k$. If $I$ is the $n \times n$ identity matrix,
this shows that $I + E_{ij}$ is in $G$ whenever $i \neq j$ and that $(I + E_{ij})^{-1} = I - E_{ij}$. Now
let $H$ be the subgroup of $G$ generated by the matrices $I + E_{ij}$, $i \neq j$. If $i$, $j$, and $k$
are distinct indices (they exist because $n \geq 3$), compute
\[
\begin{aligned}
(I + E_{ik})(I + E_{kj})(I + E_{ik})^{-1}(I + E_{kj})^{-1}
&= (I + E_{ik} + E_{kj} + E_{ij})(I - E_{ik} - E_{kj} + E_{ij}) \\
&= I + E_{ij}.
\end{aligned}
\]
This shows that every generator of $H$ is a commutator from $H$ and hence $H' = H$.
Thus, $G$ is not solvable by Corollary 1 of Theorem 3. □
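
The commutator identity used in Example 8 can be spot-checked numerically. The sketch below works over the ring of integers with plain nested lists; it only verifies the identity for small $n$ and is not a substitute for the computation above. The function names `identity`, `transvection`, `multiply`, and `commutator_of_transvections` are hypothetical helpers chosen for this illustration, and indices run from 0.

```python
def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def transvection(n, i, j, sign=1):
    """Return I + sign * E_ij for i != j (E_ij has a single 1 in position (i, j))."""
    m = identity(n)
    m[i][j] += sign
    return m


def multiply(a, b):
    n = len(a)
    return [[sum(a[r][t] * b[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]


def commutator_of_transvections(n, i, j, k):
    """Compute (I + E_ik)(I + E_kj)(I + E_ik)^{-1}(I + E_kj)^{-1}.

    The inverses are taken as I - E_ik and I - E_kj, as in Example 8.
    """
    product = multiply(transvection(n, i, k), transvection(n, k, j))
    product = multiply(product, transvection(n, i, k, -1))
    return multiply(product, transvection(n, k, j, -1))


if __name__ == "__main__":
    # With n = 3 and (i, j, k) = (0, 1, 2) the result should be I + E_01.
    print(commutator_of_transvections(3, 0, 1, 2))
```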


If $F$ is a field, Example 8 shows that the general linear group $GL_n(F)$ of all
$n \times n$ invertible matrices over $F$ is not solvable if $n \geq 3$. If $F$ is finite, Theorem 5
shows that a nonabelian simple group is lurking among the composition factors of
$GL_n(F)$. In fact, such a group exists even if $F$ is infinite. The mapping $A \mapsto \det A$
is an onto homomorphism $GL_n(F) \to F^*$ and the kernel is the special linear group
$SL_n(F)$ of all matrices with determinant 1. It is not difficult to verify that the
center of $SL_n(F)$ consists of all scalar matrices $aI$, where $a \in F$ satisfies $a^n = 1$.
The factor group
\[ PSL_n(F) = \frac{SL_n(F)}{Z[SL_n(F)]} \]
is called the projective special linear group (of degree $n$) over $F$. These groups
comprise another infinite family of finite, simple, nonabelian groups (in addition to
the alternating groups $A_n$, $n \geq 5$). The theorem was proved in 1870 by Camille
Jordan for $\mathbb{Z}_p$, $p$ a prime and, in early 1900, Leonard Eugene Dickson proved it for
all finite fields.




Theorem 6. Jordan-Dickson Theorem. If $F$ is a finite field, then $PSL_n(F)$ is
a finite nonabelian simple group for all $n \geq 2$, except for $PSL_2(\mathbb{Z}_2)$ and $PSL_2(\mathbb{Z}_3)$.


The proof is beyond the scope of this book.¹⁰³


The class of solvable groups is large. Of course, it contains all abelian groups, and 
a celebrated theorem of William Burnside asserts that every group of order $p^nq^m$ is
solvable, where p and q are primes. In a different direction, Georg Frobenius showed 
that every group of square-free order is solvable. In 1911, Burnside conjectured that 
every nonabelian finite simple group has even order, equivalently (Exercise 13) that 
every group of odd order is solvable. This conjecture remained an open question 
until 1963 when two American algebraists Walter Feit and John Thompson proved 
that it is true. The proof is 254 pages long and fills an entire issue of the Pacific 
Journal of Mathematics,¹⁰⁴ and it is widely regarded as the best single paper in
finite group theory. Thompson went on to classify all minimal finite simple groups, 
that is, those in which every proper subgroup is solvable, and played an important 
role in the classification of all finite simple groups. He was awarded the Fields Medal 
in 1970, the highest honor a mathematician can attain. 

Even though the class of solvable groups is very large, many theorems are true 
for solvable groups that are not true of groups in general. One such theorem, a 
fundamental strengthening of the Sylow theorems in any solvable group, was first 
proved in 1928 by the British mathematician Philip Hall. 


Theorem 7. Hall's Theorem. Let $G$ be a group of order $nm$, where $n$ and $m$
are relatively prime. If $G$ is solvable, then

(1) $G$ has a subgroup of order $n$ and any two are conjugate.

(2) If $k \mid n$, each subgroup of order $k$ is contained in a subgroup of order $n$.


We omit the proof.¹⁰⁵ Hall went on to develop the theory of finite solvable groups
and influenced an entire generation of group theorists.


Exercises 9.2 


1. Is Z(G) # {1} for every solvable group G # {1}? Support your answer. 

2. If G is solvable, is N(H) # H for each subgroup H # G? Support your answer. 

3. Is G’ abelian for every solvable group G? Support your answer. 

4. Does every solvable group of order n have a subgroup of order m for each divisor m 
of n? Support your answer. 

5. Give an example of a nonsolvable group in which every Sylow subgroup is abelian.

6. Show that a nonsolvable group of minimal order must be simple.


¹⁰³ See Kargapolov, M.I. and Merzljakov, J.I., Fundamentals of the Theory of Groups, Springer-Verlag,
1979; Rotman, J.J., The Theory of Groups: An Introduction, 2nd ed., Boston: Allyn &
Bacon, 1973; Artin, E., Geometric Algebra, New York: Interscience, 1957. For an elementary
proof when $n = 2$, see Lang, S., Undergraduate Algebra, Berlin: Springer-Verlag, 1987.

¹⁰⁴ Feit, W. and Thompson, J.G., Solvability of groups of odd order, Pacific Journal of Mathematics,
13 (1963), 775-1029.

¹⁰⁵ See Kargapolov, M.I. and Merzljakov, J.I., Fundamentals of the Theory of Groups, Springer-Verlag,
1979; MacDonald, I.D., The Theory of Groups, London: Oxford University Press, 1968;
Rotman, J.J., The Theory of Groups: An Introduction, 2nd ed., Boston: Allyn & Bacon, 1973.


7. Suppose $G$ has a solvable, maximal normal subgroup. Either prove $G$ is solvable or
give a counterexample.

8. Prove Lemma 1.

9. If $|G| = p^2q$, $p, q$ primes, show that $G$ is solvable. [Hint: Exercise 14 §8.4.]

10. If $|G| = p^2q^2$, $p, q$ primes, show that $G$ is solvable. [Hint: Example 9 §8.4 and the
preceding exercise.]

11. (a) Show that


is a solvable group for any field $F$.
(b) Show that
\[ G = \left\{ \begin{bmatrix} x & a & b \\ 0 & y & c \\ 0 & 0 & z \end{bmatrix} \;\middle|\; x, y, z, a, b, c \in F,\ xyz \neq 0 \right\} \]
is a solvable group for any field $F$.

12. If $p$ and $q$ are primes, show that every group of order $p^mq^n$ is solvable if and only if
the only simple groups of this type are the cyclic groups of order $p$ or $q$. (Burnside
proved that these statements are true.)

13. Show that every group of odd order is solvable if and only if every finite nonabelian
simple group has even order. (These statements are true by Feit and Thompson.)

14. Find the composition length of a solvable group of order $n = p_1^{n_1}p_2^{n_2} \cdots p_r^{n_r}$, where
the $p_i$ are distinct primes. [Hint: Example 8 §9.1.]

15. Show that a solvable group is finite if and only if it has a composition series.

16. Show that the following are equivalent for a group $G$.
(a) $G$ is solvable. (b) $G'$ is solvable. (c) $G/Z(G)$ is solvable.

17. If $H \lhd G$ and $K \lhd G$, show that $G/(H \cap K)$ is solvable if and only if the product
$(G/H) \times (G/K)$ is solvable.

18. If $K_i \lhd G$ for $i = 1, 2, \ldots, n$, put $K = K_1 \cap K_2 \cap \cdots \cap K_n$. If $G/K_i$ is solvable for
each $i$, show that $G/K$ is solvable.

19. If $H$ and $K$ are solvable subgroups of $G$ and $K \lhd G$, show that $HK$ is solvable.

20. (a) If $G$ is a finite group and $Z(G/K)$ is nontrivial for all $K \lhd G$, $K \neq G$, show that
$G$ is solvable.
(b) Show that the converse of (a) is false for a finite group $G$.

21. If $G \neq 1$ is solvable, show that
(a) $G$ has a nontrivial abelian factor group.
(b) $G$ has a nontrivial abelian normal subgroup.

22. Show that the following are equivalent for a nontrivial finite group $G$.
(1) $G$ is solvable.
(2) Every nontrivial normal subgroup of $G$ has a nontrivial abelian factor group.
(3) Every nontrivial factor group of $G$ has a nontrivial abelian normal subgroup.

23. If $G$ is a finite group, define $R = R(G) = \bigcap \{K \lhd G \mid G/K \text{ is solvable}\}$.
(a) Show that $R = R(G)$ is the smallest normal subgroup of $G$ such that $G/R$ is
solvable. [Hint: Exercise 18.]
(b) Show that $G$ is solvable if and only if $R(G) = \{1\}$.



(c) If $\alpha : G \to H$ is a group homomorphism, show that $\alpha[R(G)] \subseteq R(H)$.
[Hint: Consider $\{k \mid \alpha(k) \in R(H)\}$.]
(d) If $H \subseteq G$ is a subgroup, show that $R(H) \subseteq H \cap R(G)$, with equality if $H \lhd G$.

24. If $G$ is a finite group, define $S(G) = \prod \{K \lhd G \mid K \text{ is solvable}\}$.
(a) Prove $S(G)$ is the largest solvable, normal subgroup of $G$. [Hint: Exercise 19.]
(b) Prove $G$ is solvable if and only if $S(G) = G$.
(c) If $\alpha : G \to H$ is an onto group homomorphism, show that $\alpha[S(G)] \subseteq S(H)$.
(d) If $H \subseteq G$ is a subgroup, show that $H \cap S(G) \subseteq S(H)$, with equality if $H \lhd G$.
(e) Show that $S(G/S) = \{1\}$.

25. A group $G$ is called polycyclic if it has a solvable series with every factor cyclic.
(a) Show that every finite solvable group is polycyclic.
(b) Show that every polycyclic group is finitely generated.
(c) Show that every subgroup and homomorphic image of a polycyclic group is
polycyclic. [Hint: Lemma 1 §9.1.]
(d) If $K \lhd G$, show that $G$ is polycyclic if and only if $K$ and $G/K$ are polycyclic.
(e) Show that the following are equivalent for a group $G$. [Hint: Theorem 3 §7.2.]
(i) $G$ is polycyclic.
(ii) Every subgroup of $G$ is solvable and finitely generated.
(iii) Every normal subgroup of $G$ is solvable and finitely generated.

26. A class $\mathcal{V}$ of groups is called a subvariety if $\{1\} \in \mathcal{V}$ and each subgroup and
homomorphic image of a group in $\mathcal{V}$ is again in $\mathcal{V}$. Examples: abelian groups, $p$-groups
for a fixed prime $p$, and torsion groups (each element has finite order). If $\mathcal{V}$ is a
subvariety, a group $G$ is called $\mathcal{V}$-solvable if there is a subnormal series $G = G_0 \supseteq
G_1 \supseteq \cdots \supseteq G_n = \{1\}$ with $G_i/G_{i+1}$ in $\mathcal{V}$ for each $i$. If $G$ is a group and $K \lhd G$, show
that $G$ is $\mathcal{V}$-solvable if and only if both $K$ and $G/K$ are $\mathcal{V}$-solvable. [Hint: Lemma 1
§9.1.]

27. A subvariety $\mathcal{V}$ of groups (Exercise 26) is called a variety if, in addition, $G \times H$ is in
$\mathcal{V}$ whenever $G$ and $H$ are in $\mathcal{V}$. Examples: abelian groups, $p$-groups for a fixed prime
$p$, torsion groups, and $\mathcal{V}$-solvable groups, where $\mathcal{V}$ is any subvariety (by Exercise 26).
If $\mathcal{V}$ is a variety and $G$ is a finite group, the $\mathcal{V}$-derived subgroup of $G$ is defined
to be $\mathcal{V}(G) = \bigcap \{K \lhd G \mid G/K \text{ is in } \mathcal{V}\}$. Let $G$ denote a finite group.
(a) Show that $\mathcal{V}(G) \lhd G$ and $G/\mathcal{V}(G)$ is in $\mathcal{V}$.
(b) If $K \lhd G$, show that $G/K$ is in $\mathcal{V}$ if and only if $\mathcal{V}(G) \subseteq K$.
(c) If $H$ is a subgroup of $G$, show that $\mathcal{V}(H) \subseteq \mathcal{V}(G)$. [Hint: Lemma 2 §9.1.]
(d) If $\alpha : G \to H$ is a homomorphism of groups, show that $\alpha[\mathcal{V}(G)] \subseteq \mathcal{V}(H)$.

28. If $\mathcal{V}$ is a variety of finite groups, define $\mathcal{V}_0(G) = G$ and $\mathcal{V}_{k+1}(G) = \mathcal{V}[\mathcal{V}_k(G)]$ for each
$k \geq 0$. Let $G$ denote a finite group. Show that
(a) If $\alpha : G \to H$ is a group homomorphism, then $\alpha[\mathcal{V}_k(G)] \subseteq \mathcal{V}_k(H)$ for all $k$.
(b) $\mathcal{V}_k(G) \lhd G$ for each $k$.
(c) $G$ is $\mathcal{V}$-solvable if and only if $\mathcal{V}_k(G) = \{1\}$ for some $k$.
(d) Every subgroup of a $\mathcal{V}$-solvable group is $\mathcal{V}$-solvable.
(e) $G$ is $\mathcal{V}$-solvable if and only if $\mathcal{V}(H) \neq H$ for all subgroups $H \neq \{1\}$ of $G$.

9.3 NILPOTENT GROUPS 


If $G$ is a group, the definition of the derived subgroup $G'$ guarantees that $G$ is
abelian if and only if $G' = \{1\}$. If the process of taking the derived subgroup is
iterated, the derived series $G = G^{(0)} \supseteq G^{(1)} \supseteq G^{(2)} \supseteq \cdots$ is obtained, and $G$ is
solvable if and only if this series (of normal subgroups of $G$) reaches $\{1\}$ in a finite
number of steps (Theorem 2 §9.2). Note that $G^{(1)} = G'$. Now the center $Z(G)$ plays
an analogous role to $G'$ in the sense that $G$ is abelian if and only if $Z(G) = G$. In
view of this, an irresistible question arises: Is there a way to iterate the formation
of the center so as to create a series $\{1\} = Z_0 \subseteq Z_1 \subseteq Z_2 \subseteq \cdots$ of normal subgroups
of $G$ (with $Z(G) = Z_1$) such that $G$ is solvable if and only if this series reaches $G$ in
a finite number of steps? The answer is yes and no. Yes, there is a natural way to
define such a series. No, it does not characterize the solvable groups in this way.
Rather, it characterizes a smaller class of groups called the nilpotent groups. In
this section, we define these groups and show that the finite ones are precisely the
finite groups that are isomorphic to the direct product of their Sylow subgroups.


Central Series 


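If $G$ is a group, define a series $Z_0(G), Z_1(G), Z_2(G), \ldots$ of normal subgroups of $G$
inductively as follows:


(1) Take $Z_0(G) = \{1\}$.
(2) If $Z_i(G) \lhd G$ has been constructed, define $Z_{i+1}(G)$ to be the unique normal
subgroup of $G$ that contains $Z_i(G)$ and satisfies $Z\!\left(\dfrac{G}{Z_i(G)}\right) = \dfrac{Z_{i+1}(G)}{Z_i(G)}$.


Then $Z_{i+1}(G) \supseteq Z_i(G)$ and $Z_i(G) \lhd Z_{i+1}(G)$ for each $i \geq 0$, and the series
\[ \{1\} = Z_0(G) \subseteq Z_1(G) \subseteq Z_2(G) \subseteq \cdots \]
is called the ascending central series of $G$. Note that
\[ Z_1(G) = Z(G) \quad \text{because} \quad Z\!\left(\frac{G}{\{1\}}\right) \cong Z(G). \]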

The ascending central series may never reach G, even if G is solvable: 


Example 1. Suppose $Z(G) = \{1\}$ where $G \neq \{1\}$ (for example, the solvable group $S_3$).
Then $Z_1(G) = Z(G) = \{1\}$, so we have $Z_2(G) \cong \dfrac{Z_2(G)}{Z_1(G)} = Z\!\left(\dfrac{G}{Z_1(G)}\right) \cong Z(G) = \{1\}$.
Hence, $Z_2(G) = \{1\}$, and this process continues inductively to show that
$Z_k(G) = \{1\}$ for all $k$.


To characterize the groups $G$ for which the ascending central series reaches
$G$, it is useful to define a related descending central series. This requires a
new notion. Recall that the derived subgroup $G'$ is generated by the commutators
$[a, b] = aba^{-1}b^{-1}$ in $G$. We extend this idea as follows. If $H$ and $K$ are subgroups
of a group $G$, define
\[ [H, K] = \langle \{[h, k] \mid h \in H \text{ and } k \in K\} \rangle \]
to be the subgroup generated by the commutators $[h, k]$, with $h \in H$ and $k \in K$.
Note that $[h, k]^{-1} = [k, h]$ for all $h \in H$ and $k \in K$. Hence, $[H, K] = [K, H]$, and
this group consists of all products of commutators of the form $[h, k]$ or $[k, h]$, where
$h \in H$ and $k \in K$. In particular, $[G, G] = G'$. Lemma 1 collects several other useful
facts about these subgroups.


Lemma 1. Let $H$, $K$, $H_1$, and $K_1$ be subgroups of a group $G$.
(1) $[H, K] = [K, H]$.
(2) If $H \subseteq H_1$ and $K \subseteq K_1$, then $[H, K] \subseteq [H_1, K_1]$.
(3) If $H \lhd G$ and $K \lhd G$, then $[H, K] \lhd G$.
(4) $H \lhd G$ if and only if $[H, G] \subseteq H$.
(5) Suppose that $K \subseteq H \subseteq G$ and $K \lhd G$. Then
$H/K \subseteq Z(G/K)$ if and only if $[H, G] \subseteq K$.


Proof. We prove (5) and leave the rest as Exercise 2. If $\frac{H}{K} \subseteq Z\!\left(\frac{G}{K}\right)$, then we have
$hKgK = gKhK$ for all $g \in G$ and $h \in H$; that is, $[h, g] \in K$. Hence, $[H, G] \subseteq K$.
Since this argument works in reverse, we have proved (5). □


Now, given a group $G$, define a series $\Gamma_0(G), \Gamma_1(G), \Gamma_2(G), \ldots$ of subgroups of
$G$ inductively as follows:


(1) Take $\Gamma_0(G) = G$.
(2) If $\Gamma_i(G)$ has been constructed, define $\Gamma_{i+1}(G) = [\Gamma_i(G), G]$.

Then $\Gamma_i(G) \lhd G$ for all $i \geq 0$ by induction on $i$ (using (3) of Lemma 1), whence
$\Gamma_{i+1}(G) = [\Gamma_i(G), G] \subseteq \Gamma_i(G)$ for each $i$ by (4) of Lemma 1. Thus, we obtain a
series of normal subgroups
\[ G = \Gamma_0(G) \supseteq \Gamma_1(G) \supseteq \Gamma_2(G) \supseteq \cdots. \]
This is called the descending central series of $G$. The name comes from the fact
that, using (5) of Lemma 1,
\[ \frac{\Gamma_i(G)}{\Gamma_{i+1}(G)} \subseteq Z\!\left(\frac{G}{\Gamma_{i+1}(G)}\right) \]
holds for each $i \geq 0$. Note that
$\Gamma_1(G) = [G, G] = G'$ is the derived subgroup of $G$.

If $G$ is an abelian group, then $Z_1(G) = G$ and $\Gamma_1(G) = \{1\}$. On the other hand,
there are groups (even solvable ones by Example 1) for which the ascending central
series does not reach $G$ and the descending central series does not reach $\{1\}$. However,
if either possibility occurs, so does the other.
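
The descending central series can be computed by brute force for small permutation groups. The Python sketch below is only an illustration under simplifying assumptions: permutations of $\{0, 1, \ldots, n-1\}$ are stored as tuples, and the helper names (`compose`, `inverse`, `commutator`, `generated_subgroup`, `lower_central_orders`) are hypothetical, introduced here. It reports the orders of $\Gamma_0(G), \Gamma_1(G), \ldots$ until the series stops shrinking.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation 'apply q first, then p' (tuples of images)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def commutator(a, b):
    """[a, b] = a b a^{-1} b^{-1}."""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))


def generated_subgroup(generators, degree):
    """Subgroup generated by the given permutations (finite, so products suffice)."""
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        new = []
        for x in frontier:
            for g in generators:
                y = compose(x, g)
                if y not in elements:
                    elements.add(y)
                    new.append(y)
        frontier = new
    return elements


def lower_central_orders(group, degree):
    """Orders of Gamma_0(G), Gamma_1(G), ... until Gamma_{i+1}(G) = Gamma_i(G)."""
    orders = [len(group)]
    gamma = group
    while True:
        nxt = generated_subgroup(
            {commutator(x, g) for x in gamma for g in group}, degree
        )
        if len(nxt) == len(gamma):  # [Gamma_i, G] = Gamma_i: the series is stable
            return orders
        orders.append(len(nxt))
        gamma = nxt


if __name__ == "__main__":
    s3 = set(permutations(range(3)))
    # Dihedral group of order 8 acting on the vertices 0..3 of a square:
    # a rotation i -> i + 1 (mod 4) and a reflection i -> -i (mod 4).
    d4 = generated_subgroup({(1, 2, 3, 0), (0, 3, 2, 1)}, 4)
    print("S3:", lower_central_orders(s3, 3))
    print("D4:", lower_central_orders(d4, 4))
```

For the dihedral group of order 8 the series reaches $\{1\}$ after two steps, while for $S_3$ it stalls at $A_3$, in line with Example 1.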


Lemma 2. The following are equivalent for a group $G$ and an integer $n \geq 0$:
(1) $\Gamma_n(G) = \{1\}$.
(2) $Z_n(G) = G$.
(3) A series $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ exists with $G_i \lhd G$ for each $i$ and
$G_i/G_{i+1} \subseteq Z(G/G_{i+1})$.


Proof. Write $\Gamma_i(G) = \Gamma_i$ and $Z_i(G) = Z_i$ for each $i$.

(1) ⇒ (2). If $\Gamma_n = \{1\}$, we show that $\Gamma_{n-i} \subseteq Z_i$ for each $i = 0, 1, 2, \ldots$ (so
$Z_n = G$). This is clear if $i = 0$ by (1), so assume $\Gamma_{n-i} \subseteq Z_i$, where $i \geq 0$. If
$a \in \Gamma_{n-i-1}$, then, for all $g \in G$, $[a, g] \in [\Gamma_{n-i-1}, G] = \Gamma_{n-i} \subseteq Z_i$. Thus, $aZ_i$ is in
the center of $G/Z_i$ and so $a \in Z_{i+1}$. Hence, $\Gamma_{n-i-1} \subseteq Z_{i+1}$ as required.

(2) ⇒ (3). Given (2), use $G = Z_n \supseteq Z_{n-1} \supseteq \cdots \supseteq Z_0 = \{1\}$ in (3).

(3) ⇒ (1). Given (3), we show that $\Gamma_i \subseteq G_i$ for each $i = 0, 1, 2, \ldots$ (so $\Gamma_n = \{1\}$).
This is clear if $i = 0$, so assume that $\Gamma_i \subseteq G_i$ for some $i \geq 0$. We must show that
$[\Gamma_i, G] = \Gamma_{i+1} \subseteq G_{i+1}$, so we show that $[a, g] \in G_{i+1}$ for all $a \in \Gamma_i$, $g \in G$. But
$\frac{G_i}{G_{i+1}} \subseteq Z\!\left(\frac{G}{G_{i+1}}\right)$ by (3), so $a \in \Gamma_i \subseteq G_i$ implies that $aG_{i+1}$ commutes with $gG_{i+1}$
for all $g \in G$. This in turn implies that $[a, g] \in G_{i+1}$, as required. □




A group $G$ is called a nilpotent group if the conditions in Lemma 2 are satisfied for
some $n \geq 0$. The smallest integer $n$ for which $\Gamma_n(G) = \{1\}$, equivalently $Z_n(G) = G$,
is called the nilpotency class of $G$. Thus, if $G$ is nilpotent, then $G$ has class 1 if and
only if it is abelian, and (Exercise 11) $G$ has class 2 if and only if it is nonabelian
and $G' \subseteq Z(G)$.

A series as in (3) of Lemma 2 is called a central series for $G$. Suppose that
$G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ is any central series for a nilpotent group $G$. Then
the proof that (3) ⇒ (1) in Lemma 2 derives the first of the following inclusions:
\[ \Gamma_i(G) \subseteq G_i \subseteq Z_{n-i}(G) \quad \text{for } 0 \leq i \leq n. \]
We leave the second inclusion for the reader (Exercise 7). Hence, we often call the
series $G = \Gamma_0(G) \supseteq \Gamma_1(G) \supseteq \Gamma_2(G) \supseteq \cdots$ and $\{1\} = Z_0(G) \subseteq Z_1(G) \subseteq Z_2(G) \subseteq \cdots$
the lower and upper central series, respectively.


Example 2. Every abelian group is nilpotent. 


Example 3. If $p$ is a prime, every finite $p$-group is nilpotent. In fact, if we write
$Z_i(G) = Z_i$ for each $i$, Theorem 6 §8.2 shows that $Z_{i+1}/Z_i = Z(G/Z_i)$ is not trivial
if $Z_i \neq G$ because $G/Z_i$ is a $p$-group. Hence, $\{1\} \subset Z_1 \subset Z_2 \subset \cdots$, which eventually
reaches $G$ because $G$ is finite.


Example 4. Show that every nilpotent group $G$ is solvable, but not conversely.


Solution. If $G$ is nilpotent, the series $\{1\} = Z_0(G) \subseteq \cdots \subseteq Z_n(G) = G$ is a solvable
series. However, $S_3$ is solvable but not nilpotent by Example 1. □


Theorem 1. Every subgroup and image of a nilpotent group is again nilpotent.


Proof. Let $G$ be nilpotent. To show that a subgroup $K \subseteq G$ is nilpotent, it suffices
to show that $\Gamma_i(K) \subseteq \Gamma_i(G)$ for each $i$. This is clear if $i = 0$. If $\Gamma_i(K) \subseteq \Gamma_i(G)$ for
some $i$, then the induction goes through because
\[ \Gamma_{i+1}(K) = [\Gamma_i(K), K] \subseteq [\Gamma_i(G), G] = \Gamma_{i+1}(G). \]
Now let $\alpha : G \to H$ be an onto homomorphism; we show that $\Gamma_i(H) \subseteq \alpha[\Gamma_i(G)]$
for each $i$. If $i = 0$, it is because $\alpha$ is onto. In general, let $\Gamma_i(H) \subseteq \alpha(\Gamma_i(G))$, and
let $y \in \Gamma_i(H)$ and $h \in H$. Write $y = \alpha(x)$, where $x \in \Gamma_i(G)$, and (as $\alpha$ is onto) write
$h = \alpha(g)$, $g \in G$. Then
\[ [y, h] = [\alpha(x), \alpha(g)] = \alpha[x, g] \in \alpha[\Gamma_i(G), G] = \alpha(\Gamma_{i+1}(G)). \]
Hence, $\Gamma_{i+1}(H) = [\Gamma_i(H), H] \subseteq \alpha(\Gamma_{i+1}(G))$, as required. □
Corollary. A group G is nilpotent if and only if G' is nilpotent. 
Proof. Write D = G'. If D™ = {1}, then G**) = D™ = {1}. a 


By Theorem 1, if $K \lhd G$ and $G$ is nilpotent, then both $K$ and $G/K$ are nilpotent.
The converse is false ($S_3$ is again a counterexample) in contrast to the situation for
solvable groups. However, the converse does hold when $K \subseteq Z(G)$.


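Theorem 2. If $G$ is a group, $K \subseteq Z(G)$, and $G/K$ is nilpotent, then $G$ is nilpotent.


Proof. Assume that $G/K$ is nilpotent, and let $\theta: G \to G/K$ be the coset map. By
hypothesis, let $\theta(G) = X_0 \supseteq X_1 \supseteq \cdots \supseteq X_n = \{K\}$ be a central series for $\theta(G)$,
where $X_i/X_{i+1} \subseteq Z(\theta(G)/X_{i+1})$ for $0 \leq i \leq n-1$. Write $X_i = G_i/K = \theta(G_i)$ for $0 \leq i \leq n$.
Then we obtain the series

$$G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = K \supseteq G_{n+1} = \{1\}.$$

We show this is a central series for $G$. First,

$$\frac{G_n}{G_{n+1}} = \frac{K}{\{1\}} \subseteq Z\!\left(\frac{G}{\{1\}}\right) = Z\!\left(\frac{G}{G_{n+1}}\right)$$

because $K \subseteq Z(G)$. To see that $G_i/G_{i+1} \subseteq Z(G/G_{i+1})$ for $0 \leq i < n$, let $a \in G_i$.
Then $\theta(a) \in \theta(G_i) = X_i$, so $\theta(a)X_{i+1}$ commutes with $\theta(g)X_{i+1}$ for all $g \in G$.
Thus, $\theta[a, g] = [\theta(a), \theta(g)] \in X_{i+1} = \theta(G_{i+1})$, say $\theta[a, g] = \theta(b)$, $b \in G_{i+1}$. Thus,
$[a, g]b^{-1} \in \ker\theta = K \subseteq G_{i+1}$, so $[a, g] \in G_{i+1}b = G_{i+1}$. This means $aG_{i+1}$
commutes with $gG_{i+1}$, that is, $aG_{i+1} \in Z(G/G_{i+1})$, as required. ∎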


The next result will be needed in Theorem 4. 
Theorem 3. If $G_1, G_2, \ldots, G_n$ are nilpotent, so also is $G_1 \times G_2 \times \cdots \times G_n$.


Proof. This follows because $\Gamma_i(G_1 \times G_2 \times \cdots \times G_n) \subseteq \Gamma_i(G_1) \times \cdots \times \Gamma_i(G_n)$ for
each $i$, a fact that we leave as Exercise 6. ∎


Theorem 3 and Example 3 combine to show that any finite direct product of 
finite p-groups (for various primes p) is nilpotent. In fact, every finite nilpotent 
group is isomorphic to such a direct product. We need the following notion. 

A subgroup $M$ of a group $G$ is said to be maximal in $G$ if $M \neq G$ and the only
subgroups $H$ such that $M \subseteq H \subseteq G$ are $H = M$ and $H = G$. Clearly, every proper
subgroup $K$ of a finite group is contained in a maximal subgroup (one of maximal
order containing $K$). If $G$ is finite, every subgroup of prime index is maximal by
Example 6 §2.6.$^{106}$ The converse is not necessarily true (any subgroup of index
4 in $A_4$ is maximal), but it does hold in a finite $p$-group. Moreover, in this case,
the maximal subgroups (of index $p$) are necessarily normal (see the corollary to
Theorem 1 §8.3). This property characterizes the finite nilpotent groups.


Theorem 4. Burnside-Wielandt Theorem.$^{107}$ The following conditions are
equivalent for a finite group $G \neq \{1\}$:

(1) $G$ is nilpotent.

(2) $N(H) \neq H$ for all subgroups $H \neq G$ of $G$.

(3) Every maximal subgroup of G is normal in G. 

(4) Every Sylow subgroup of G is normal in G. 

(5) G is isomorphic to the direct product of its Sylow subgroups. 


Proof. (1) $\Rightarrow$ (2). Write $Z_i = Z_i(G)$ for each $i$ and assume that $Z_n = G$. If $H \neq G$
is a subgroup of $G$, then $Z_0 \subseteq H$ but $Z_n \not\subseteq H$, so an integer $k \geq 0$ exists such that
$Z_k \subseteq H$ but $Z_{k+1} \not\subseteq H$. Choose $a \in Z_{k+1}$, $a \notin H$. Then $aZ_k$ is in the center of
$G/Z_k$, so if $h \in H$, $aZ_k$ and $hZ_k$ commute. Hence, $hah^{-1}a^{-1} \in Z_k \subseteq H$, from which
$aHa^{-1} \subseteq H$. Thus, $a \in N(H)$, and so $N(H) \neq H$.

(2) $\Rightarrow$ (3). Let $M$ be a maximal subgroup of $G$. Since $M \subseteq N(M) \subseteq G$, (2)
implies that $N(M) = G$. Hence, $M \lhd G$.


106 This is also true if G is infinite by Exercise 31, §2.6. 
107 The name honors William Burnside and Helmut Wielandt. 


(3) $\Rightarrow$ (4). Suppose $P$ is a nonnormal Sylow $p$-subgroup of $G$. Then $N(P) \neq G$,
so let $N(P) \subseteq M$, where $M$ is a maximal subgroup of $G$. Because $P \subseteq M$, (3)
gives $aPa^{-1} \subseteq aMa^{-1} = M$ for all $a \in G$. Hence, both $P$ and $aPa^{-1}$ are Sylow
$p$-subgroups of $M$ and so are conjugate in $M$, say $P = m(aPa^{-1})m^{-1}$ for some
$m \in M$. But then $ma \in N(P)$, so $a \in M$. Because $a \in G$ was arbitrary, this means
$G \subseteq M$, a contradiction. This proves (4).

(4) $\Rightarrow$ (5). Let $P_1, P_2, \ldots, P_r$ denote the distinct Sylow subgroups of $G$.


Claim. $P_1P_2 \cdots P_k \cong P_1 \times P_2 \times \cdots \times P_k$ for each $k = 2, 3, \ldots, r$.


Proof. It is clear if $k = 1$. Assume inductively that $P_1P_2 \cdots P_k \cong P_1 \times P_2 \times \cdots \times P_k$
for some $k \geq 1$. Then $(P_1P_2 \cdots P_k) \cap P_{k+1} = \{1\}$ because elements in the two
subgroups have relatively prime orders. By Theorem 6 §2.8,

$$(P_1P_2 \cdots P_k)P_{k+1} \cong (P_1 \times P_2 \times \cdots \times P_k) \times P_{k+1} \cong P_1 \times P_2 \times \cdots \times P_{k+1}$$

because $P_1P_2 \cdots P_k \lhd G$ and $P_{k+1} \lhd G$. This proves the claim.

The claim gives $|P_1P_2 \cdots P_r| = |P_1||P_2| \cdots |P_r| = |G|$. Hence $G = P_1P_2 \cdots P_r$
and (5) follows, again by the Claim.

(5) $\Rightarrow$ (1). This follows from Theorem 3 and Example 3. ∎


It is interesting to compare (2) in Theorem 4 with the result (Theorem 5 §9.2) that
a finite group $G$ is solvable if and only if $H' \neq H$ for every subgroup $H \neq \{1\}$.

Since every finite abelian group is nilpotent, the implication (1) $\Rightarrow$ (5) in Theorem 4
gives another proof of the primary decomposition theorem for finite abelian
groups (Corollary 2 of Theorem 3 §7.2). We reformulate (1) $\Leftrightarrow$ (5) as


Corollary 1. A finite group G is nilpotent if and only if G is isomorphic to a finite 
direct product of p-groups for various primes p. 
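
Condition (4) of Theorem 4 can be tested numerically on small examples. The sketch below is an illustration rather than the text's method: it relies on the standard consequence of the Sylow theorems that a Sylow $p$-subgroup is normal exactly when the number of elements of $p$-power order equals the $p$-part of $|G|$. The tuple encoding of permutations and all function names are assumptions made for the example.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(p)))

def generate(gens, degree):
    """Subgroup generated by gens, obtained by closing under multiplication."""
    identity = tuple(range(degree))
    elements, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def order(p):
    """Order of the permutation p."""
    identity, power, k = tuple(range(len(p))), p, 1
    while power != identity:
        power, k = compose(power, p), k + 1
    return k

def prime_factorization(n):
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def is_power_of(k, p):
    while k % p == 0:
        k //= p
    return k == 1

def all_sylow_subgroups_normal(G):
    """True when, for each prime p, the p-elements number exactly the p-part of |G|."""
    for p, e in prime_factorization(len(G)).items():
        p_elements = sum(1 for g in G if is_power_of(order(g), p))
        if p_elements != p ** e:
            return False
    return True

S3 = generate([(1, 0, 2), (1, 2, 0)], 3)        # symmetric group of degree 3
D4 = generate([(1, 2, 3, 0), (0, 3, 2, 1)], 4)  # dihedral group of order 8
C6 = generate([(1, 2, 0, 4, 3)], 5)             # cyclic group of order 6
```

Here $S_3$ fails the test because it has four elements of 2-power order (the identity and three transpositions) but a 2-part of only 2, whereas the 2-group of order 8 and the cyclic group of order 6 both pass.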


Frattini and Fitting Subgroups 


One of the most important aspects of the study of nilpotent groups is that every
finite group $G$ contains a nilpotent subgroup $\Phi$, which is characteristic in $G$ (that
is, $\sigma(\Phi) = \Phi$ for every automorphism $\sigma$ of $G$; these subgroups are discussed in
Corollary 3 of Theorem 3 §2.8). We now turn to a discussion of this.

If $G \neq \{1\}$ is a finite group, define the Frattini subgroup

$$\Phi(G) = \bigcap \{M \subseteq G \mid M \text{ is a maximal subgroup of } G\}.$$

Define $\Phi(\{1\}) = \{1\}$. This was introduced in 1885 by Giovanni Frattini.


Example 5. $\Phi(A_4) = \{\varepsilon\}$. Indeed, $K = \{\varepsilon, (1\ 2)(3\ 4), (1\ 3)(2\ 4), (1\ 4)(2\ 3)\}$
is maximal, being of index 3, and $M = \{\varepsilon, (1\ 2\ 3), (1\ 3\ 2)\}$ is maximal (it has
index 4, but $A_4$ has no subgroup of index 2). Hence, we have $\Phi(A_4) \subseteq K \cap M = \{\varepsilon\}$.

Example 6. If $Q = \{\pm 1, \pm i, \pm j, \pm k\}$ is the quaternion group, then $\Phi(Q) = \{1, -1\}$
because $\langle i \rangle$, $\langle j \rangle$, and $\langle k \rangle$ are the only maximal subgroups.

Example 7. If $G = \langle a \rangle$ and $o(a) = p^n$, where $p$ is a prime, $\Phi(G) = \langle a^p \rangle$ because
$\langle a^p \rangle$ is the unique maximal subgroup (of index $p$).
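
Example 7 can be checked by direct computation. The following sketch works in the additive group $\mathbb{Z}_n$ in place of a multiplicative cyclic group $\langle a \rangle$, and assumes the standard fact that the subgroups of $\mathbb{Z}_n$ are the sets of multiples of the divisors $d$ of $n$, the maximal ones being those with $d$ prime. The function names are illustrative only.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def is_prime(k):
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def subgroup(n, d):
    """The subgroup of Z_n consisting of the multiples of a divisor d of n."""
    return frozenset(range(0, n, d))

def frattini_cyclic(n):
    """Intersect the maximal subgroups pZ_n (p a prime divisor of n) of Z_n."""
    phi = set(range(n))
    for p in divisors(n):
        if is_prime(p):
            phi &= subgroup(n, p)
    return sorted(phi)

phi_8 = frattini_cyclic(8)    # Z_8 has the single maximal subgroup 2Z_8
phi_12 = frattini_cyclic(12)  # intersection of 2Z_12 and 3Z_12
phi_1 = frattini_cyclic(1)    # trivial group: no maximal subgroups
```

For $n = 8 = 2^3$ the result is the set of even residues, which is the additive counterpart of $\langle a^p \rangle$ in Example 7. For $n = 12$ the intersection is $\{0, 6\}$.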

Theorem 5. Let $G$ be a group and write $\Phi = \Phi(G)$. Then the following hold:

(1) If $\alpha: G \to H$ is an onto group homomorphism, then $\alpha(\Phi) \subseteq \Phi(H)$.

(2) In particular, $\Phi$ is a characteristic subgroup of $G$.

(3) $G$ is nilpotent if and only if $G' \subseteq \Phi$.


Proof. (1) If $U \subseteq H$ is a maximal subgroup, we must show that $\alpha(\Phi) \subseteq U$. If we
define $M = \{m \in G \mid \alpha(m) \in U\}$, then it suffices to show that $M$ is a maximal
subgroup of $G$ (since then $\Phi \subseteq M$). But if $M \subseteq K \subseteq G$ are subgroups of $G$ then
$\alpha(M) \subseteq \alpha(K) \subseteq \alpha(G)$. Since $\alpha$ is onto, this is $U \subseteq \alpha(K) \subseteq H$, so $\alpha(K) = U$ or
$\alpha(K) = H$. These imply that $K = M$ or $K = G$.

(2) This follows from (1) if $H = G$ and $\alpha$ is an automorphism of $G$.

(3) $G' \subseteq \Phi$ if and only if $G' \subseteq M$ for each maximal subgroup $M$ of $G$, if and
only if $M \lhd G$ for each $M$, and if and only if $G$ is nilpotent by Theorem 4. ∎


Corollary 1. The following are equivalent for a finite group G: 
(1) G is nilpotent. 
(2) G/®(G) is abelian. 
(3) G/®(G) is nilpotent. 


Proof. (1) $\Leftrightarrow$ (2) restates (3) of Theorem 5 and (2) $\Rightarrow$ (3) is obvious.

(3) $\Rightarrow$ (1). Given (3), we show that every maximal subgroup $M$ of $G$ is normal.
Write $\Phi = \Phi(G)$. Since $\Phi \subseteq M$, the subgroup $M/\Phi$ is maximal in $G/\Phi$ by the
correspondence theorem, so $M/\Phi \lhd G/\Phi$ by (3) and Theorem 4. But then $M \lhd G$,
again by the correspondence theorem. ∎


To see that $\Phi(G)$ is a nilpotent group, we first characterize it in terms of the
following concept. An element $t \in G$ is called a nongenerator in $G$ if it can be
omitted from any generating set $X$ of $G$; that is, if $G = \langle X \cup \{t\} \rangle$, then $G = \langle X \rangle$.


Theorem 6. Let $G$ denote any finite group. Then the following hold:
(1) $\Phi(G) = \{t \mid t \text{ is a nongenerator of } G\}$.
(2) $\Phi(G)$ is a nilpotent group.


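Proof. For convenience, write $\Phi = \Phi(G)$.

(1) Write $N = \{t \mid t \text{ is a nongenerator of } G\}$ and let $a \in \Phi$. If $a \notin N$, then
$X \subseteq G$ exists such that $\langle X \cup \{a\} \rangle = G$ but $\langle X \rangle \neq G$. So let $\langle X \rangle \subseteq M$, where $M$ is a
maximal subgroup of $G$. Then $a \in M$ because $\Phi \subseteq M$, whence $G = \langle X \cup \{a\} \rangle \subseteq M$,
a contradiction. Hence, $\Phi \subseteq N$. Conversely, if $t \in N$ and $M$ is a maximal subgroup
of $G$, then $t \in M$ (otherwise $\langle M \cup \{t\} \rangle = G$ and so, as $t$ is a nongenerator, $G = \langle M \rangle = M$). Hence, $N \subseteq \Phi$.

(2) By Theorem 4 we show that every Sylow $p$-subgroup $P$ of $\Phi$ is normal in
$G$. If $g \in G$, then $gPg^{-1} \subseteq g\Phi g^{-1} = \Phi$ by (2) of Theorem 5. Hence, both $gPg^{-1}$
and $P$ are Sylow $p$-subgroups of $\Phi$ and so are conjugate in $\Phi$, say $t(gPg^{-1})t^{-1} = P$,
where $t \in \Phi$. Thus, $tg \in N(P)$, which yields $G = \Phi N(P)$. But then $G = \langle \Phi \cup N(P) \rangle$
so, as $\Phi$ is finite, $G = \langle N(P) \rangle = N(P)$ by (1). Hence, $P \lhd G$ as required. ∎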
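
Part (1) of Theorem 6 can be verified exhaustively in a small case. This sketch, an illustration under stated assumptions, works in the additive group $\mathbb{Z}_n$ and uses the standard fact that a subset $X$ generates $\mathbb{Z}_n$ exactly when the greatest common divisor of its elements together with $n$ is 1. It tests the definition of a nongenerator against every subset $X$; the function names are hypothetical.

```python
from math import gcd
from functools import reduce
from itertools import combinations

def generates(X, n):
    """X generates Z_n exactly when gcd(n, x_1, x_2, ...) = 1."""
    return reduce(gcd, X, n) == 1

def nongenerators(n):
    """Elements t of Z_n such that <X ∪ {t}> = Z_n forces <X> = Z_n for every X."""
    elements = range(n)
    subsets = [c for r in range(n + 1) for c in combinations(elements, r)]
    return [t for t in elements
            if all(generates(X, n) for X in subsets if generates(X + (t,), n))]

nongens_12 = nongenerators(12)
nongens_8 = nongenerators(8)
```

The nongenerators of $\mathbb{Z}_{12}$ found this way are $0$ and $6$, which is the intersection of the two maximal subgroups $2\mathbb{Z}_{12}$ and $3\mathbb{Z}_{12}$, in agreement with Theorem 6(1). The search is exponential in $n$ and is meant only for very small groups.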


Note the proof of (2) shows that Sylow p-subgroups of ®(G) are normal in G. 


Corollary 1. Assume that $\Phi(G)$ is finitely generated (for example, if $G$ is finite).
If $H\Phi(G) = G$, where $H$ is a subgroup, then $H = G$.


Proof. Write $\Phi(G) = \Phi$. If $\Phi = \langle t_1, \ldots, t_n \rangle$, then $G = \langle H \cup \{t_1, \ldots, t_n\} \rangle$ and the
nongenerators $t_i$ can be removed one by one. ∎


The next result extends a useful theorem about finite $p$-groups.

Theorem 7. If $G$ is a nilpotent group and $\{1\} \neq H \lhd G$, then $H \cap Z(G) \neq \{1\}$.


Proof. Write $\Gamma_i = \Gamma_i(G)$ for each $i$. We have $G = \Gamma_0 \supseteq \Gamma_1 \supseteq \cdots \supseteq \Gamma_n = \{1\}$
for some $n$, so $H \cap \Gamma_n = \{1\}$ while $H \cap \Gamma_0 \neq \{1\}$. So there exists $k$ such that
$H \cap \Gamma_k \neq \{1\}$ while $H \cap \Gamma_{k+1} = \{1\}$. Choose $1 \neq h \in H \cap \Gamma_k$. If $g \in G$, then
$[h, g] \in [\Gamma_k, G] = \Gamma_{k+1}$. Also, $[h, g] = h^{-1}g^{-1}hg \in H(g^{-1}Hg) = H$ because
$H \lhd G$. So $[h, g] \in H \cap \Gamma_{k+1} = \{1\}$, whence $hg = gh$. Thus, $h \in Z(G) \cap H$. ∎


In 1938, Hans Fitting identified a largest nilpotent normal subgroup in every 
finite group. His key result was 


Theorem 8. Fitting’s Theorem. If H and K are nilpotent, normal subgroups 
of a finite group G, so also is HK. 


Proof. We proceed by induction on $|G|$. We have $HK \lhd G$, so we may assume
(by induction) that $G = HK$ and that $H \neq \{1\} \neq K$. Write $W = Z(K)$. Then
$W \neq \{1\}$ by Theorem 7. Also, $W \lhd G$ being characteristic in $K \lhd G$. If we write
$N = [W, H]$, the proof falls into two cases.

Case 1. $N = \{1\}$. Then $W$ centralizes $H$ (and $K$), so $W \subseteq Z(HK) = Z(G)$. But
$\frac{G}{W} = \frac{HW}{W} \cdot \frac{K}{W}$. Moreover, $\frac{HW}{W} \cong \frac{H}{H \cap W}$ and $\frac{K}{W}$ are both nilpotent by Theorem 1, so
$\frac{G}{W}$ is nilpotent by induction. Hence, $G$ is nilpotent by Theorem 2.

Case 2. $N \neq \{1\}$. We have $N \subseteq W \cap H$ because $W \lhd G$ and $H \lhd G$. In particular,
$V = N \cap Z(H) \neq \{1\}$ again by Theorem 7. As before, $\frac{G}{V} = \frac{H}{V} \cdot \frac{KV}{V}$ and both $\frac{H}{V}$ and
$\frac{KV}{V}$ are nilpotent, so $\frac{G}{V}$ is nilpotent by induction. But $V$ centralizes $H$, and it also
centralizes $K$ because $V \subseteq N \subseteq W = Z(K)$. Hence, $V \subseteq Z(HK) = Z(G)$, so $G$ is
nilpotent by Theorem 2 and we are done in this case too. ∎


Now let $G$ be any finite group. If $N_1 = \{1\}, N_2, \ldots, N_k$ denote all the nilpotent,
normal subgroups of $G$, define the Fitting subgroup $F(G)$ of $G$ by

$$F(G) = N_1N_2 \cdots N_k.$$

Then $F(G) \lhd G$, and it is nilpotent by Theorem 8 and induction on $k$. This proves


Theorem 9. If G is a finite group, then F(G) is the largest nilpotent, normal 
subgroup of G in the sense that it contains every such subgroup. 


Corollary 1. If $\alpha: G \to H$ is an onto homomorphism, then $\alpha[F(G)] \subseteq F(H)$.


Proof. $\alpha[F(G)] \lhd H$ because $\alpha$ is onto, and it is nilpotent by Theorem 1. ∎


Hence, a finite group $G$ is nilpotent if and only if $F(G) = G$. Clearly, $Z(G) \subseteq F(G)$,
and $\Phi(G) \subseteq F(G)$ because it is nilpotent (Theorem 6) and normal in $G$ (Theorem
5). Moreover, $F(G)$ is a characteristic subgroup of $G$ by the above corollary.


Lemma 3. If $G$ is finite and $N \lhd G$, then $N$ is nilpotent if and only if $N' \subseteq \Phi(G)$.


Proof. Write $\Phi(G) = \Phi$. If $N' \subseteq \Phi$, then $N'$ is nilpotent by Theorem 6, and so
$N$ is nilpotent by the corollary to Theorem 1. Conversely, if $N$ is nilpotent, then
$N' \subseteq \Phi(N)$ by Theorem 5, and it remains to show that $\Phi(N) \subseteq \Phi$. Write $\Phi(N) = H$
and suppose $H \not\subseteq \Phi$. Then $H \not\subseteq M$ for some maximal subgroup $M$ of $G$, so $HM = G$.
Since $H \subseteq N$, the modular law (Lemma 1 §8.1) gives

$$N = G \cap N = HM \cap N = H(M \cap N).$$

Thus, $N = \langle H \cup (M \cap N) \rangle$ and $H = \Phi(N)$ consists of (a finite number of) nongenerators
of $N$. Hence, $N = M \cap N$, whence $N \subseteq M$, a contradiction. ∎


We can now describe the relationship between the Frattini and Fitting subgroups 
in a finite group. 
Theorem 10. Let $G$ be a finite group and write $\Phi = \Phi(G)$ and $F = F(G)$.

(1) $F' \subseteq \Phi \subseteq F$.

(2) $F(G/\Phi) = F/\Phi$.


Proof. (1) We have $F' \lhd G$ because it is characteristic in $F \lhd G$, so $F' \subseteq \Phi$ by
Lemma 3 because $F$ is nilpotent. On the other hand, $\Phi \subseteq F$ by Theorem 9 because
$\Phi$ is nilpotent. This proves (1).

(2) This follows from a more general result (Theorem 11).$^{108}$ ∎


Theorem 11. If $G$ is a finite group and $\Phi(G) \subseteq N \lhd G$, then $N$ is nilpotent if
and only if $N/\Phi(G)$ is nilpotent.


Proof. Write $\Phi(G) = \Phi$. If $N$ is nilpotent, so is its image $\frac{N}{\Phi}$. For the converse,
assume that $\frac{N}{\Phi}$ is nilpotent. To show that $N$ is nilpotent, we show that every Sylow
$p$-subgroup $P$ of $N$ is normal in $N$ (and invoke Theorem 4). First, $\frac{\Phi P}{\Phi}$ is a Sylow
$p$-subgroup of $\frac{N}{\Phi}$ (by Example 4 §8.4), whence $\frac{\Phi P}{\Phi} \lhd \frac{N}{\Phi}$ by hypothesis. But then
$\frac{\Phi P}{\Phi}$ is characteristic in $\frac{N}{\Phi} \lhd \frac{G}{\Phi}$, and it follows that $\frac{\Phi P}{\Phi} \lhd \frac{G}{\Phi}$. Thus, $\Phi P \lhd G$.
But $P$ is a Sylow $p$-subgroup of $\Phi P$ (because $P \subseteq N$), so $G = (\Phi P)N_G(P)$ by
Lemma 1 §8.4. Since $G$ is finite and $\Phi$ consists of nongenerators, it follows that
$G = PN_G(P) = N_G(P)$. Hence, $P \lhd G$, so certainly $P \lhd N$ as required. ∎


There is much more information available on nilpotent groups in books on group
theory.$^{109}$


Exercises 9.3 


1. (a) Show that $A_n$ is not nilpotent if $n > 3$.
(b) Show that every nilpotent group is solvable, but not conversely.

2. Prove (1)-(4) in Lemma 1.

3. If $H$ and $K$ are subgroups of $G$ and $\alpha: G \to G_1$ is a homomorphism, show that
$\alpha[H, K] = [\alpha(H), \alpha(K)]$. Conclude that $[H, K]$ is normal in $G$ (characteristic in $G$)
if the same is true of $H$ and $K$.


108 Because of (2) and the corollary to Theorem 9, the Fitting subgroup of $G$ is also called the
nilpotent radical of $G$.

109 The following books contain excellent introductions to the theory of nilpotent groups:
MacDonald, I.D., The Theory of Groups, London: Oxford University Press, 1968; Rose, J.S., A
Course on Group Theory, Cambridge, England: Cambridge University Press, 1978; Kargapolov,
M.I. and Merzljakov, J.I., Fundamentals of the Theory of Groups, New York: Springer-Verlag,
1979; Gorenstein, D., Finite Groups, 2nd ed., New York: Chelsea, 1980.


4. If $\alpha: G \to H$ is any homomorphism, show that $\alpha[\Gamma_i(G)] = \Gamma_i[\alpha(G)]$.

5. If $G^{(k)}$ is the $k$th derived subgroup of $G$ (see Theorem 1, Section 9.2), show that
$G^{(k+1)} = [G^{(k)}, G^{(k)}]$ for each $k \geq 0$.

6. (a) Show that $\Gamma_i(G_1 \times \cdots \times G_n) \subseteq \Gamma_i(G_1) \times \cdots \times \Gamma_i(G_n)$ for all $i \geq 0$.
(b) Show that equality holds in (a).

7. If $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq G_n = \{1\}$ is any central series for a group $G$, show that
$G_{n-i} \subseteq Z_i(G)$ for each $i$.

8. Let
$$G = \left\{ \begin{bmatrix} a & b \\ 0 & c \end{bmatrix} \;\middle|\; a, b, c \in F;\ ac \neq 0 \right\},$$
where $F$ is a field. Is $G$ nilpotent?

9. Show that $D_n$ is nilpotent if and only if $n$ is a power of 2.

10. Show that a finite group is nilpotent if and only if any two elements of relatively
prime orders commute.

11. Show that a group $G$ is nilpotent of class 2 if and only if $G$ is nonabelian and
$G' \subseteq Z(G)$.

12. Show that a finite group $G$ is nilpotent if and only if $Z(G/K)$ is nontrivial for all
$K \lhd G$, $K \neq G$. [Hint: Theorem 7.]

13. If $G$ is a finite nilpotent group, let $K$ be of minimal order in $\{K \mid \{1\} \neq K \lhd G\}$.
Show that $K \subseteq Z(G)$ and that $|K|$ is a prime. [Hint: Theorem 7.]

14. If $H \lhd G$ and $K \lhd G$, show that $G/(H \cap K)$ is nilpotent if and only if the product
group $(G/H) \times (G/K)$ is nilpotent.

15. Show that a finite group $G$ is nilpotent if and only if $G$ has a normal subgroup of
order $m$ for every divisor $m$ of $|G|$.

16. A subgroup $H$ of a group $G$ is called subnormal in $G$ if a chain of subgroups
$H = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_n = G$ exists such that $H_i \lhd H_{i+1}$ for each $i$. Show that a
finite group $G$ is nilpotent if and only if every subgroup is subnormal.

17. If $K \subseteq Z(G)$ and $G/K$ is nilpotent, show that $G$ is nilpotent using only (3) of
Theorem 4.

18. (a) If $G$ is nilpotent, show that $Z(H) \neq \{1\}$ for all subgroups $H \neq \{1\}$.
(b) Show that the converse is false by considering Qs.

19. If $G$ is nilpotent and $G/G'$ is cyclic, show that $G$ is abelian. [Hint: Apply Theorem
2 §2.9 to $G/[\Gamma_2(G)]$ and conclude that $\Gamma_2(G) = \Gamma_1(G)$.]

20. If $G$ is a finite group show that (2) $\Leftrightarrow$ (3) $\Leftrightarrow$ (4) and (2) $\Rightarrow$ (1), but (1) $\nRightarrow$ (2).
(1) $G'$ is abelian. (2) $G/Z(G)$ is abelian. (3) $\Gamma_2(G) = \{1\}$. (4) $Z_2(G) = G$.

21. Let $D_n = \langle a, b \rangle$ where $o(a) = n$, $o(b) = 2$, and $aba = b$. Show that (a) $\Phi(D_4) = \{1, a^2\}$,
(b) $\Phi(D_{12}) \subseteq \{1, a^6\}$, and (c) $\Phi(D_{pq}) = \{1\}$, where $p \neq q$ are primes. [Hint: If
$H \subseteq \langle a \rangle$ has index $m$, show that $H \cup Hb$ is a subgroup of index $m$.]

22. If $o(a) = p_1^{n_1}p_2^{n_2} \cdots p_r^{n_r}$, where $p_i$ are distinct primes, show that $\Phi[\langle a \rangle] = \langle a^m \rangle$,
where $m = p_1p_2 \cdots p_r$.

23. Let $|G| = p^3$, where $p$ is a prime. If $G$ is nonabelian, show that $\Phi(G) = G' = Z(G)$
and that this subgroup has order $p$. [Hint: Exercise 26 §8.2.]

24. If $G$ is a finite group, write $\Phi = \Phi(G)$. Show that the following are equivalent:
(1) $G$ is nilpotent (2) $G/\Phi$ is abelian (3) $G/\Phi$ is nilpotent

25. If $K \lhd G$, where $G$ is finite, and $\Phi(G/K) = \{K\}$, show that $\Phi(G) \subseteq K$.

26. (a) If $G$ is finite, $K \lhd G$, and $K \subseteq \Phi(G)$, show that $\Phi(G/K) = \Phi(G)/K$.
(b) Show that $\Phi(G/\Phi(G)) = \{1\}$.

27. Show that $\Phi(G \times H) = \Phi(G) \times \Phi(H)$ for finite groups $G$ and $H$.

28. If $G$ is a finite group and $H \lhd G$, show that $\Phi(H) \subseteq H \cap \Phi(G)$. [Hint: If $\Phi(H) \not\subseteq M$,
where $M$ is maximal in $G$, show that $G = \Phi(H)M$ and apply Lemma 1 §8.1.]

29. (a) If $G$ is a finite group and $M \subseteq G$ is a maximal subgroup, show that either
$Z(G) \subseteq M$ or $G' \subseteq M$. [Hint: $MZ(G) = G$ implies that $M \lhd G$.]
(b) Show that $Z(G) \cap G' \subseteq \Phi(G)$ for all finite groups $G$.

30. Show that a finite group $G$ can be generated by $n$ elements if and only if the same
is true of $G/\Phi(G)$.

31. If $G$ is a finite $p$-group, $p$ a prime, show that $\Phi(G) = \langle G' \cup \{g^p \mid g \in G\} \rangle$.


Chapter 10 


Galois Theory 


In most sciences, one generation tears down what another has built and what one has 
established another undoes. In mathematics alone, each generation adds a new storey 
to the old structure. 


—Hermann Hankel 


The moving power of mathematical invention is not reasoning but imagination. 


—Augustus de Morgan 


If $E \supseteq F$ is an extension of fields, Galois theory studies the set of automorphisms
$\sigma: E \to E$ that fix $F$ in the sense that $\sigma(a) = a$ for all $a \in F$. The set $G$ of all such
automorphisms is a group called the Galois group of $E$ over $F$. With appropriate
restrictions on the extension $E \supseteq F$, we can establish a bijection (called the Galois
correspondence) between the subgroups of $G$ and the subfields of $E$ that contain
$F$. This correspondence is very useful in deducing properties of the subfields from
properties of the corresponding subgroups and conversely.$^{110}$

The origins of Galois theory lie in the theory of equations. Methods implying
the quadratic formula for solving $x^2 + bx + c = 0$ were known to the Babylonians
in 1600 BC, but an algebraic formulation did not appear until the second century
AD. As to cubics, nothing appears to have been done until the fifteenth century
when Scipione del Ferro, and later Niccolò Tartaglia, found what is now called
the cubic formula. This result, together with Lodovico Ferrari's formula for solving
quartics, was published in 1545 in the book Ars Magna by the physician Girolamo
Cardano.


110 This chapter requires only Sections 6.1-6.4 as background. The material on solvable groups
needed in Section 10.3 is adequately reviewed there.


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 




After that the greatest mathematicians attempted to find a similar formula 
for expressing the roots of an arbitrary quintic in terms of coefficients by using 
only arithmetic operations and the extraction of nth roots (called radicals). Possi- 
bly the most important step was taken by Lagrange in 1770 when he unified the 
previous work by showing that, in every case, the solution depended on finding 
combinations of the roots of the equations that were unchanged when the roots 
were permuted. He showed that his method failed for the quintic, which aroused 
suspicion that a general formula was impossible in this case. A flawed proof of 
this impossibility by Ruffini appeared in 1813, and in 1824 Abel settled the matter 
once and for all: No general formula for the roots of a quintic exists that uses only 
radicals. 

The general problem of determining which polynomial equations could be solved 
by radicals was resolved in 1830 by a 19-year-old Frenchman Evariste Galois. He 
had submitted three papers to the Academy of Sciences in Paris, but all were 
rejected. Incredibly, he was killed in a duel in 1832, and it was not until 1846 that 
his work finally received the recognition it deserved. 


10.1 GALOIS GROUPS AND SEPARABILITY 


If $E \supseteq F$ are fields, Galois theory is concerned with the automorphisms $\sigma: E \to E$
that fix $F$ in the sense that $\sigma(a) = a$ for all $a \in F$. In this case, $\sigma$ is called an
$F$-automorphism of $E$. The identity automorphism $\varepsilon$ certainly has this property,
and one easily verifies that the set of $F$-automorphisms is a subgroup of the
group of all automorphisms of $E$. This group is called the Galois group of the
extension $E \supseteq F$ and is denoted $\operatorname{gal}(E : F)$. We focus on this group throughout the
chapter.


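Example 1. $\operatorname{gal}(F : F) = \{\varepsilon\}$ for all fields $F$.


Example 2. Show that $\operatorname{gal}(\mathbb{C} : \mathbb{R}) = \{\varepsilon, \gamma\}$, where $\gamma: \mathbb{C} \to \mathbb{C}$ is the conjugation
automorphism defined by $\gamma(z) = \bar{z}$ for all $z \in \mathbb{C}$.


Solution. If $\sigma \in \operatorname{gal}(\mathbb{C} : \mathbb{R})$ and $z = a + bi$ in $\mathbb{C}$, then

$$\sigma(z) = \sigma(a + bi) = \sigma(a) + \sigma(b)\sigma(i) = a + b\sigma(i).$$

But $\sigma(i)^2 = \sigma(i^2) = \sigma(-1) = -1$, so $\sigma(i) = i$ or $\sigma(i) = -i$. These conditions give
$\sigma = \varepsilon$ or $\sigma = \gamma$, respectively. □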


Throughout this chapter, we use terminology and notation for field extensions
from Sections 6.1-6.4, usually without comment. For convenience, we restate three
lemmas from Chapter 6 that will be referred to repeatedly. They come, respectively,
from Theorem 4 §6.2, Theorem 3 §6.3, and Theorem 3 §6.2.


Lemma 1. If $E \supseteq F$ are fields, and $u \in E$ is algebraic over $F$ of degree $n$, then:
(1) $F(u) = \{f(u) \mid f \in F[x]\}$.
(2) $\{1, u, \ldots, u^{n-1}\}$ is an $F$-basis of $F(u)$.


If $\sigma: F \to \bar{F}$ is a ring homomorphism, and if $f = \sum a_ix^i \in F[x]$, recall that we
define $f^\sigma \in \bar{F}[x]$ by $f^\sigma = \sum \sigma(a_i)x^i$.


Lemma 2. Let $\sigma: F \to \bar{F}$ be an isomorphism of fields $F$ and $\bar{F}$ and let $p$ be a monic,
irreducible polynomial in $F[x]$. If $u$ and $v$ are roots of $p$ and $p^\sigma$
in extension fields of $F$ and $\bar{F}$, respectively, then
there is an isomorphism

$$\hat{\sigma}: F(u) \to \bar{F}(v),$$

given by $\hat{\sigma}[f(u)] = f^\sigma(v)$ for each $f \in F[x]$, which extends $\sigma$.$^{111}$

[Figure: diagram of $\hat{\sigma}: F(u) \to \bar{F}(v)$ lying over $\sigma: F \to \bar{F}$.]


Lemma 3. If $E \supseteq F$ is a field extension and $u \in E$ is algebraic over $F$, let $m$
denote the monic polynomial of least degree such that $m(u) = 0$. Then $m$ is uniquely
determined by $u$, irreducible, and satisfies $f(u) = 0$, $f \in F[x]$, if and only if $m \mid f$.


The polynomial $m$ in Lemma 3 is called the minimal polynomial of $u$ over $F$,
and the degree of $m$ is called the degree of $u$ over $F$ and is written $\deg_F(u)$.


Let us return to Example 2. The fact that $\mathbb{C} = \mathbb{R}(i)$ is essential in the solution.
Indeed $\mathbb{C}$ contains all roots of the polynomial $x^2 + 1$, and the key observation in
the solution is this: Given $\sigma \in \operatorname{gal}(\mathbb{C} : \mathbb{R})$, the fact that $i$ is a root of $x^2 + 1$ implies
that $\sigma(i)$ is also a root. This basic fact is recorded as part of Lemma 4 (the proof
is Exercise 2).


Lemma 4. If $E \supseteq F$ are fields, $G = \operatorname{gal}(E : F)$, $u \in E$, and $\sigma \in G$, then

(1) $\sigma[f(u)] = f[\sigma(u)]$ for all $f \in F[x]$.

(2) In particular, if $u$ is a root of $f$, then $\sigma(u)$ is also a root of $f$.

(3) If $u$ is algebraic over $F$, and $\sigma, \tau \in \operatorname{gal}(F(u) : F)$, then $\sigma = \tau$ if and only if
$\sigma(u) = \tau(u)$.

Let $F$ be a field and let $F(u)$ be a simple extension of $F$ where $u$ is algebraic over
$F$. We want to determine $\operatorname{gal}(F(u) : F)$. By (3) of Lemma 4, each $\sigma \in \operatorname{gal}(F(u) : F)$
is completely determined by the choice of $\sigma(u)$ in $F(u)$. But this choice is not
arbitrary. If $m$ is the minimal polynomial of $u$ over $F$, then $m(u) = 0$, so $\sigma(u)$ is
also a root of $m$ by Lemma 4(2). Moreover, if $u_1 = u, u_2, \ldots, u_r$ are the distinct
roots of $m$ in $F(u)$ then, by Lemma 2, an $F$-automorphism $\sigma_i: F(u) \to F(u)$ exists
for each $i$ such that $\sigma_i(u) = u_i$. Theorem 1 sums up this discussion.


Theorem 1. Let $F(u) \supseteq F$ be a simple extension, where $u$ is algebraic over $F$ with
minimal polynomial $m \in F[x]$. If $u_1 = u, u_2, \ldots, u_r$ are the distinct roots of $m$ in
$F(u)$, then

$$\operatorname{gal}(F(u) : F) = \{\sigma_1 = \varepsilon, \sigma_2, \ldots, \sigma_r\},$$

where, for each $i$, $\sigma_i$ is the unique $F$-automorphism of $F(u)$ that satisfies $\sigma_i(u) = u_i$.
Hence if $m$ splits in $F(u)$ and has distinct roots then $|\operatorname{gal}(F(u) : F)| = [F(u) : F]$.


Proof. It is clear that $\{\sigma_i \mid 1 \leq i \leq r\} \subseteq \operatorname{gal}(F(u) : F)$, and the other inclusion
follows from the above discussion. ∎


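Example 3. If $u = \sqrt[3]{2}$, show that $\operatorname{gal}(\mathbb{Q}(u) : \mathbb{Q}) = \{\varepsilon\}$.


Solution. Here, $m = x^3 - 2$ is the minimal polynomial of $u$ over $\mathbb{Q}$. The roots of $m$
in $\mathbb{C}$ are $u$, $u\omega$, and $u\omega^2$, where $\omega = e^{2\pi i/3}$, so $u$ is the only root in $\mathbb{Q}(u)$. Thus, any
$\sigma$ in $\operatorname{gal}(\mathbb{Q}(u) : \mathbb{Q})$ must satisfy $\sigma(u) = u$, from which $\sigma = \varepsilon$. □


111 That is, $\hat{\sigma}(a) = \sigma(a)$ for all $a \in F$.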


Example 4. If $u = e^{2\pi i/5}$, write $G = \operatorname{gal}(\mathbb{Q}(u) : \mathbb{Q})$. Show that $G \cong C_4$.


Solution. The roots in $\mathbb{C}$ of $x^5 - 1$ are $1, u, u^2, u^3$, and $u^4$, and they are distinct.
Now $x^5 - 1 = (x - 1)\Phi_5(x)$, where $\Phi_5(x) = 1 + x + x^2 + x^3 + x^4$ is the fifth
cyclotomic polynomial. This is $\mathbb{Q}$-irreducible by Example 13 §4.2 and so is the
minimal polynomial of $u$. Hence, the roots of $\Phi_5(x)$ in $\mathbb{C}$ are $u, u^2, u^3$, and $u^4$,
and they are distinct and all lie in $\mathbb{Q}(u)$. It follows that $|G| = 4$ by Theorem 1.
By Lemma 2, $\sigma \in G$ exists with $\sigma(u) = u^2$. Then $\sigma^2(u) = \sigma(u^2) = \sigma(u)^2 = u^4$,
so $\sigma^3(u) = \sigma(u^4) = u^8 = u^3$. Thus $\varepsilon, \sigma, \sigma^2$, and $\sigma^3$ are distinct by Lemma 4(3), so
$G = \{\varepsilon, \sigma, \sigma^2, \sigma^3\}$. Clearly, $G \cong C_4$. □


There is nothing special about the prime 5 in Example 4. Indeed, if $p$ is any
prime and $u = e^{2\pi i/p}$, the same argument shows that the minimal polynomial of $u$ is
$\Phi_p(x) = 1 + x + \cdots + x^{p-1}$ and that this polynomial has distinct roots $u, u^2, \ldots,
u^{p-1}$ in $\mathbb{Q}(u)$. Hence, Theorem 1 shows that $G = \operatorname{gal}[\mathbb{Q}(u) : \mathbb{Q}]$ satisfies $|G| = p - 1$.
By Theorem 7 §6.4, $\mathbb{Z}_p^*$ is cyclic, say $\mathbb{Z}_p^* = \langle m \rangle = \{1, m, m^2, \ldots, m^{p-2}\}$. By Lemma
2, there exists $\sigma \in G$ satisfying $\sigma(u) = u^m$. Now let $\tau \in G$ be arbitrary. Then $\tau(u)$
is a root of $\Phi_p(x)$, say $\tau(u) = u^k$, where $1 \leq k \leq p - 1$. Thus, $k \equiv m^t \pmod{p}$,
where $0 \leq t \leq p - 2$, so $u^k = u^{m^t}$ because $|u| = p$. In other words, $\tau(u) = \sigma^t(u)$, so
$\tau = \sigma^t \in \langle \sigma \rangle$ by Lemma 4(3). Hence, $G \subseteq \langle \sigma \rangle$, so, since $|G| = p - 1$, this shows that
$G = \langle \sigma \rangle \cong C_{p-1}$. We record this as Example 5.


Example 5. If $p$ is a prime and $u = e^{2\pi i/p}$, then $\operatorname{gal}(\mathbb{Q}(u) : \mathbb{Q}) \cong C_{p-1}$.
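
The exponent bookkeeping behind Examples 4 and 5 is easy to reproduce. In this minimal sketch (the function names are illustrative), an automorphism with $\sigma(u) = u^m$ is tracked only through the exponents $k$ with $\sigma^t(u) = u^k$, so composing automorphisms amounts to multiplying exponents modulo $p$.

```python
def powers_mod(m, p):
    """Exponents k with σ^t(u) = u^k for t = 0, 1, 2, ..., where σ(u) = u^m."""
    seen, k = [], 1
    while k not in seen:
        seen.append(k)
        k = (k * m) % p
    return seen

def cyclic_generators(p):
    """All m in Z_p^* whose powers exhaust Z_p^*; each gives a generator σ of G."""
    return [m for m in range(1, p) if len(powers_mod(m, p)) == p - 1]

orbit_5 = powers_mod(2, 5)          # ε, σ, σ², σ³ send u to u^1, u^2, u^4, u^3
generators_5 = cyclic_generators(5)
generators_7 = cyclic_generators(7)
```

For $p = 5$ and $m = 2$ the exponents run through $1, 2, 4, 3$, matching the computation $\sigma(u) = u^2$, $\sigma^2(u) = u^4$, $\sigma^3(u) = u^3$ in Example 4.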

Example 6. Let $E = GF(p^n)$, where $p$ is a prime, and regard $\mathbb{Z}_p$ as a subfield of
$E$. Show that $\operatorname{gal}(E : \mathbb{Z}_p) \cong C_n$.


Solution. Write $G = \operatorname{gal}(E : \mathbb{Z}_p)$. Corollary 1 of Theorem 7 §6.4 gives $E = \mathbb{Z}_p(u)$
for some $u \in E$, so $|G| \leq n$ by Theorem 1 because the minimal polynomial of $u$ over
$\mathbb{Z}_p$ has degree $[E : \mathbb{Z}_p] = n$. On the other hand, let $\sigma: E \to E$ be the Frobenius
automorphism defined by $\sigma(w) = w^p$ for all $w \in E$. Then $\sigma \in G$ by Fermat's
theorem, and it suffices to show that $o(\sigma) = n$. One verifies that $\sigma^k(w) = w^{p^k}$ for
all $k \geq 1$. Hence, if $\sigma^k = \varepsilon$, then every element of $E$ is a root of $x^{p^k} - x$. As $|E| = p^n$,
this condition implies that $k \geq n$. Hence $o(\sigma) \geq n$, and so $o(\sigma) = n$ because $|G| \leq n$, as required. □
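
The Frobenius automorphism of Example 6 can be examined concretely in $GF(8)$. The sketch below is an illustration under stated assumptions: the field is modeled as $\mathbb{Z}_2[x]/(x^3 + x + 1)$ (this modulus is one possible irreducible cubic, chosen here for the example), elements are encoded as 3-bit integers, and addition is bitwise XOR.

```python
MODULUS = 0b1011   # x^3 + x + 1, an assumed irreducible cubic over Z_2
DEGREE = 3

def gf_mul(a, b):
    """Multiply two elements of Z_2[x]/(x^3 + x + 1) encoded as 3-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a            # add the current multiple of a
        b >>= 1
        a <<= 1                    # multiply a by x
        if a & (1 << DEGREE):
            a ^= MODULUS           # reduce modulo x^3 + x + 1
    return result

def frobenius(w):
    """The Frobenius map w -> w^p with p = 2."""
    return gf_mul(w, w)

def order_of_frobenius():
    """Smallest k >= 1 such that applying the Frobenius map k times fixes every element."""
    field = list(range(1 << DEGREE))
    images, k = field, 0
    while True:
        images = [frobenius(w) for w in images]
        k += 1
        if images == field:
            return k

frobenius_order = order_of_frobenius()
is_additive = all(frobenius(a ^ b) == frobenius(a) ^ frobenius(b)
                  for a in range(8) for b in range(8))
```

The computed order is $3 = n$, consistent with $\operatorname{gal}(GF(2^3) : \mathbb{Z}_2) \cong C_3$, and the additivity check confirms that squaring respects addition in characteristic 2.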

Theorem 1 gives a lot of information about the Galois group of a simple algebraic
extension, a situation that occurs commonly (see Theorem 6). However,
many of the techniques used to prove it apply to any finite field extension $E \supseteq F$;
that is, $[E : F] = \dim_F E$ is finite. Recall that $E \supseteq F$ is finite if and only if
$E = F(u_1, u_2, \ldots, u_n)$, where each $u_i \in E$ is algebraic over $F$ (Theorem 6 §6.2).


Theorem 2. Let $E = F(u_1, \ldots, u_n) \supseteq F$ be a finite extension where, for each $i$,
$u_i$ is algebraic over $F$ with minimal polynomial $m_i$. If $\sigma \in \operatorname{gal}(E : F)$, then

(1) $\sigma$ is uniquely determined by the choice of $\sigma(u_1), \ldots, \sigma(u_n)$ in $E$.

(2) $\sigma(u_i)$ is a root of $m_i$ for each $i$.

(3) If $\sigma, \tau \in \operatorname{gal}(E : F)$, then $\sigma = \tau$ if and only if $\sigma(u_i) = \tau(u_i)$ for each $i$.
In particular, $\operatorname{gal}(E : F)$ is a finite group.


Proof. (1) This follows from (3).

(2) We have $m_i[\sigma(u_i)] = \sigma[m_i(u_i)] = \sigma(0) = 0$ using Lemma 4.

(3) Let $\sigma(u_i) = \tau(u_i)$ for each $i$, where $\tau \in \operatorname{gal}(E : F)$; we must show that
$\sigma = \tau$. Writing $\lambda = \tau^{-1}\sigma$, it suffices to show the following: If $\lambda \in \operatorname{gal}(E : F)$
satisfies $\lambda(u_i) = u_i$ for each $i$, then $\lambda = \varepsilon$. We prove this by induction on $n$. If $n = 1$,
it is Lemma 4(3). If $n \geq 2$, write $K = F(u_1)$. Then $\lambda$ fixes $K$, and it follows
that $\lambda \in \operatorname{gal}(K(u_2, \ldots, u_n) : K)$. Thus, $\lambda = \varepsilon$ by induction. The last sentence
follows from (1) and (2). ∎


All the Galois groups that we have constructed so far are abelian. However, this 
is not the case in general; indeed, every finite group can be realized as a Galois 
group (Corollary 2 of Theorem 3 §10.3). For now, however, we content ourselves 
with constructing a nonabelian example using Theorem 2. 


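Example 7. If $E$ is the splitting field$^{112}$ of $x^3 - 2$ over $\mathbb{Q}$, show that $\operatorname{gal}(E : \mathbb{Q}) \cong D_3$.


Solution. Write $G = \operatorname{gal}(E : \mathbb{Q})$, $u = \sqrt[3]{2}$, and $\omega = e^{2\pi i/3}$. Then the roots
of $x^3 - 2$ are $u, u\omega$, and $u\omega^2$, so $E = \mathbb{Q}(u, u\omega, u\omega^2) = \mathbb{Q}(u, \omega)$. The minimal
polynomials of $u$ and $\omega$ over $\mathbb{Q}$ are $x^3 - 2$ and $x^2 + x + 1$, respectively, and
$x^2 + x + 1$ has roots $\omega$ and $\omega^2$ in $E$. Thus, for $\sigma \in G$, Theorem 2 shows that
$\sigma(u) \in \{u, u\omega, u\omega^2\}$ and $\sigma(\omega) \in \{\omega, \omega^2\}$, and hence that $|G| \leq 3 \times 2 = 6$. On
the other hand, a $\mathbb{Q}$-isomorphism $\sigma_0: \mathbb{Q}(u) \to \mathbb{Q}(u\omega)$ exists by Lemma 2
with $\sigma_0(u) = u\omega$ (see Theorem 3 §6.3). This isomorphism in turn extends (by
the same theorem) to an automorphism $\sigma$ of $E = \mathbb{Q}(u)(\omega) = \mathbb{Q}(u\omega)(\omega)$ with
$\sigma(\omega) = \omega$ (see the figure). Thus, $\sigma \in G$ satisfies $\sigma(u) = u\omega$ and
$\sigma(\omega) = \omega$. Similarly, $\tau \in G$ can be constructed such that $\tau(u) = u$ and
$\tau(\omega) = \omega^2$. It is a routine matter (using
Theorem 2) to verify that $o(\sigma) = 3$, $o(\tau) = 2$, and $\sigma\tau\sigma = \tau$. Thus, $\langle \sigma, \tau \rangle \cong D_3$, so,
because $|G| \leq 6$, $G = \langle \sigma, \tau \rangle$. □

[Figure: diagram showing $\sigma: E = \mathbb{Q}(u, \omega) \to \mathbb{Q}(u\omega, \omega) = E$ lying over $\sigma_0: \mathbb{Q}(u) \to \mathbb{Q}(u\omega)$, which lies over the identity $\mathbb{Q} \to \mathbb{Q}$.]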
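
The relations claimed in Example 7 can be checked by tracking how each automorphism permutes the three roots. In this sketch (names and encoding are illustrative assumptions), the root $u\omega^j$ is represented by its index $j \in \{0, 1, 2\}$, and an automorphism with $u \mapsto u\omega^a$, $\omega \mapsto \omega^b$ sends $u\omega^j$ to $u\omega^{a + bj}$.

```python
from itertools import product

def root_permutation(a, b):
    """Permutation of root indices j induced by u -> uω^a, ω -> ω^b (b is 1 or 2)."""
    return tuple((a + b * j) % 3 for j in range(3))

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[j]] for j in range(3))

def order(p):
    identity, power, k = (0, 1, 2), p, 1
    while power != identity:
        power, k = compose(power, p), k + 1
    return k

sigma = root_permutation(1, 1)   # u -> uω,  ω -> ω
tau = root_permutation(0, 2)     # u -> u,   ω -> ω²
group = {root_permutation(a, b) for a, b in product(range(3), (1, 2))}

sigma_tau_sigma = compose(sigma, compose(tau, sigma))
is_nonabelian = compose(sigma, tau) != compose(tau, sigma)
```

The six choices of $(a, b)$ give six distinct permutations of the roots, $\sigma$ has order 3, $\tau$ has order 2, and $\sigma\tau\sigma = \tau$, which are the defining relations used for $D_3$ in the solution.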


Separable Extensions 


Let $G = \operatorname{gal}(E : F)$, where $E$ is the splitting field of a polynomial $f$ over $F$, and let
$X$ denote the set of distinct roots of $f$ in $E$. If $\sigma \in G$, then $\sigma(u) \in X$ for all $u \in X$,
so we have the restriction map $\sigma|_X: X \to X$ defined by

$$\sigma|_X(u) = \sigma(u) \quad \text{for all } u \in X.$$

Then $\sigma|_X \in S_X$ because $\sigma$ is one-to-one and $X$ is finite, and $\sigma \mapsto \sigma|_X$ is a group
homomorphism that is one-to-one by Theorem 2. Hence, we can view $G$ as a group
of permutations of $X$. The following terminology is standard. A group $G$ of permutations
of a set $X$ is said to act transitively on $X$ if, for all $u, v \in X$, there exists
$\sigma \in G$ such that $\sigma(u) = v$.


112 Splitting fields are discussed in detail in Section 6.3.


Theorem 3. Let $G = \operatorname{gal}(E : F)$, where $E$ is the splitting field of a polynomial $f$
over $F$, and let $X$ denote the set of distinct roots of $f$ in $E$. Then

(1) $G$ is isomorphic (by restriction) to a subgroup of $S_X$.

(2) If $f$ is irreducible in $F[x]$, then $G$ acts transitively on $X$.

(3) If $f$ has no repeated root in $E$, and $G$ acts transitively on $X$, then $f$ is
irreducible in $F[x]$.


Proof. (1) This follows by the discussion preceding this theorem.

(2) If $u, v \in X$, then $E$ is the splitting field of $f$ over $F(u)$ and also over $F(v)$.
Hence, Lemma 2 gives an $F$-isomorphism $\sigma_0: F(u) \to F(v)$ with $\sigma_0(u) = v$. This
isomorphism extends to $\sigma \in G$ by Theorem 4 §6.3.

(3) Suppose that $f = gh$ in $F[x]$; $g, h$ not constant. Let $g(u) = 0 = h(v)$,
where $u, v \in X$. Because $G$ acts transitively, let $v = \sigma(u)$, where $\sigma \in G$. Then
$g(v) = g[\sigma(u)] = \sigma[g(u)] = 0$, so $v$ is a repeated root of $f$, contrary to hypothesis. ∎


If $E \supseteq F$ is a finite extension of fields, we want to determine the size of the Galois
group $G = \operatorname{gal}(E : F)$. If $E = F(u)$, where $u \in E$ is algebraic over $F$, Theorem 1
shows that $|G|$ is the number of distinct roots in $E$ of the minimal polynomial of $u$.
If $E = F(u_1, \ldots, u_n)$ and $m_i$ is the minimal polynomial of $u_i$ for each $i$, Theorem 2
shows that $\sigma \in G$ is determined by its effect on the roots of these polynomials $m_i$.
To count these automorphisms, we adopt a different perspective.

We assume that $E$ is the splitting field$^{113}$ of a polynomial $f$ in $F[x]$. We are
going to prove that if every irreducible factor of $f$ has distinct roots in $E$, the
Galois group $G = \operatorname{gal}(E : F)$ has order $|G| = [E : F]$. Examples 5 and 7 illustrate
this. The next result provides a simple test for when an irreducible polynomial has
distinct roots. The test involves the formal derivative $f'$ of a polynomial $f$, defined
in Section 6.4 as follows: If

$$f = a_0 + a_1x + \cdots + a_nx^n, \quad \text{then} \quad f' = a_1 + 2a_2x + \cdots + na_nx^{n-1}.$$

The usual properties of derivatives remain valid (Theorem 2 §6.4).


Lemma 5. If F is a field, the following conditions are equivalent for an irreducible polynomial p in F[x].

(1) p has distinct roots in every extension field of F in which it splits.

(2) p has distinct roots in some splitting field of p over F.

(3) p′ ≠ 0.
Proof. (1) ⇒ (2). This is clear.

(2) ⇒ (3). Let E ⊇ F be a splitting field for p over F and let p(u) = 0, u ∈ E. If p′ = 0, then x − u divides both p and p′ and so (x − u)² divides p by Theorem 3 §6.4, contrary to (2). So p′ ≠ 0.

(3) ⇒ (1). Suppose that p splits in E ⊇ F and assume that u ∈ E is a repeated root of p in E. Then (x − u)² divides p in E[x] and so (x − u) divides both p and p′ in E[x] by Theorem 3 §6.4. But p and p′ are relatively prime in F[x] (p is irreducible and does not divide p′, because p′ ≠ 0 has smaller degree than p), so (x − u) divides 1 in E[x], a contradiction. ∎

113 Not every finite extension is a splitting field. For example, Q(∛2) is not a splitting field of any polynomial in Q[x]. We discuss this topic further in Section 10.2.
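The proof turns on whether p and p′ have a common factor, so over a prime field Z_p the test can be carried out mechanically with the Euclidean algorithm (Exercise 22 states the general criterion). The following is a minimal sketch, not a definitive implementation. It assumes that polynomials are stored as coefficient lists with the constant term first and that the modulus is prime; the helper names `trim`, `derivative`, `poly_mod` and `poly_gcd` are hypothetical.

```python
def trim(f):
    """Drop trailing zero coefficients (lists are constant term first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def derivative(f, p):
    """Formal derivative of f over Z_p."""
    return trim([(k * a) % p for k, a in enumerate(f)][1:])

def poly_mod(f, g, p):
    """Remainder of f on division by the nonzero polynomial g over Z_p."""
    f, g = trim([a % p for a in f]), trim([a % p for a in g])
    inv = pow(g[-1], -1, p)  # inverse of the leading coefficient (Python 3.8+)
    while len(f) >= len(g):
        c = (f[-1] * inv) % p
        shift = len(f) - len(g)
        for i, b in enumerate(g):
            f[shift + i] = (f[shift + i] - c * b) % p
        f = trim(f)
    return f

def poly_gcd(f, g, p):
    """Monic gcd of f and g over Z_p (f is assumed nonzero)."""
    f, g = trim([a % p for a in f]), trim([a % p for a in g])
    while g:
        f, g = g, poly_mod(f, g, p)
    inv = pow(f[-1], -1, p)
    return [(a * inv) % p for a in f]

# (x - 1)^2 (x - 2) = x^3 - 4x^2 + 5x - 2 over Z_5 has the repeated root 1.
repeated = [-2, 5, -4, 1]
gcd_repeated = poly_gcd(repeated, derivative(repeated, 5), 5)

# x^2 + 2 is irreducible over Z_5 and its derivative 2x is nonzero.
irreducible = [2, 0, 1]
gcd_irreducible = poly_gcd(irreducible, derivative(irreducible, 5), 5)
print(gcd_repeated, gcd_irreducible)
```

The first gcd is x − 1 (stored as [4, 1] modulo 5), which exposes the repeated root. The second is the constant 1, in agreement with condition (3) of Lemma 5.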


If F is a field, an irreducible polynomial p in F[x] is called separable over F if it satisfies the conditions in Lemma 5, and a polynomial f ∈ F[x] of positive degree is called separable over F (or separable in F[x]) if all its irreducible factors are separable. An extension E ⊇ F of fields is called a separable extension if it is algebraic and the minimal polynomial of each element of E is separable over F.


Example 8. The irreducible polynomial p = x² + 2 is separable over Q because p′ ≠ 0, or because its roots ±i√2 in C ⊇ Q are distinct. Hence, the polynomial x⁴ + 4x² + 4 = (x² + 2)² is also separable over Q.


Example 9. Show that f = x⁶ − x³ − 1 is separable over Z_3. However, f′ = 0.

Solution. We have f = p³, where p = x² − x − 1 is irreducible over Z_3. Hence, it suffices to show that p is separable. But p is separable by Lemma 5 because p′ = 2x − 1 ≠ 0. However, f′ = 0 because char Z_3 = 3. □
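The arithmetic in Example 9 is easy to confirm by machine. The sketch below assumes coefficient lists with the constant term first and reduces everything modulo 3; the helper names are hypothetical and are not taken from the text.

```python
P = 3  # characteristic of Z_3

def poly_mul(f, g, p):
    """Product of two coefficient lists over Z_p (constant term first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def derivative(f, p):
    """Formal derivative over Z_p, without trimming zero coefficients."""
    return [(k * a) % p for k, a in enumerate(f)][1:]

def roots_mod(f, p):
    """Elements of Z_p at which f vanishes."""
    return [a for a in range(p) if sum(c * a**k for k, c in enumerate(f)) % p == 0]

p_poly = [c % P for c in (-1, -1, 1)]                # x^2 - x - 1
f = [c % P for c in (-1, 0, 0, -1, 0, 0, 1)]         # x^6 - x^3 - 1
cube = poly_mul(poly_mul(p_poly, p_poly, P), p_poly, P)

print(cube == f)               # f equals p^3 over Z_3
print(roots_mod(p_poly, P))    # a quadratic with no root in Z_3 is irreducible
print(derivative(p_poly, P))   # coefficients of p' = 2x - 1 (note -1 = 2 in Z_3)
print(derivative(f, P))        # every coefficient of f' vanishes
```

The example shows that f′ = 0 does not by itself make f inseparable: separability is decided by the irreducible factors, here by p alone.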


Let f = a_0 + a_1 x + a_2 x² + ··· + a_k x^k + ··· be a polynomial in F[x]. Whether the formal derivative f′ = 0 depends on the characteristic of the field F. We have

f′ = a_1 + 2a_2 x + ··· + k a_k x^{k−1} + ···

so f′ = 0 if and only if k a_k = 0 for all k ≥ 1. If char F = 0, this implies that a_k = 0 for all k ≥ 1, so f = a_0 is constant (as in calculus). However, if char F = p is a prime, then f′ = 0 implies that a_k = 0 whenever p does not divide k, that is, f = g(x^p) for some polynomial g in F[x]. Conversely, Theorem 2 §6.4 gives [g(x^p)]′ = g′(x^p)(p x^{p−1}) = 0 when the characteristic is p. With Lemma 5, this observation gives Theorem 4.
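The equivalence just described (f′ = 0 exactly when f is a polynomial in x^p) can be checked exhaustively for small cases. This is only a sanity check under stated assumptions: coefficient lists are constant term first, the field is Z_3, and the function names are hypothetical.

```python
from itertools import product

def derivative_is_zero(f, p):
    """True when the formal derivative of f vanishes over Z_p."""
    return all((k * a) % p == 0 for k, a in enumerate(f) if k >= 1)

def is_poly_in_x_to_p(f, p):
    """True when every nonzero coefficient of f sits at an exponent divisible by p."""
    return all(a % p == 0 for k, a in enumerate(f) if k % p != 0)

p = 3
# Compare the two conditions on all polynomials of degree at most 6 over Z_3.
agree = all(
    derivative_is_zero(f, p) == is_poly_in_x_to_p(f, p)
    for f in product(range(p), repeat=7)
)
print(agree)
```

The polynomial x⁶ − x³ − 1 of Example 9 passes both tests, while x² − x − 1 fails both.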


Theorem 4. Let f be an irreducible polynomial in F[x], where F is a field.
(1) If char F = 0, then f is separable over F.
(2) If char F = p, then f is separable over F if and only if it is not of the form f = g(x^p) for some polynomial g ∈ F[x].


Corollary. If char F = 0, every algebraic extension of F is separable.


Our goal is to show that if E ⊇ F is the splitting field of a separable polynomial in F[x], then the Galois group has order [E : F]. It is convenient to prove slightly more. Suppose that σ : F → F̄ is an isomorphism of fields. If f = ∑_{i=0}^{n} a_i x^i is a polynomial in F[x], recall that f^σ ∈ F̄[x] is defined by f^σ(x) = ∑_{i=0}^{n} σ(a_i) x^i. If E ⊇ F and Ē ⊇ F̄ are splitting fields of f and f^σ, respectively, then Theorem 4 §6.3 asserts that an isomorphism σ̂ : E → Ē exists that extends σ (that is, σ̂(a) = σ(a) for all a ∈ F). If f is a separable polynomial, we can count such extensions.


Theorem 5. Let σ : F → F̄ be an isomorphism of fields and let f ∈ F[x] be a separable polynomial. If E ⊇ F and Ē ⊇ F̄ are splitting fields of f and f^σ, respectively, there are exactly [E : F] isomorphisms σ̂ : E → Ē that extend σ.


Proof. Use induction on [E : F]. If [E : F] = 1, then E = F and f splits in F[x]; that is, f = a(x − a_1)···(x − a_n), where a, a_i ∈ F. Since f ↦ f^σ is a ring homomorphism, f^σ = σ(a)(x − σ(a_1))···(x − σ(a_n)) splits in F̄[x]. This means that Ē = F̄ and the only extension is σ̂ = σ.

If [E : F] > 1, then f does not split in F[x], so let p be an irreducible factor of f with deg p = k ≥ 2. Fix a root u ∈ E of p. Then any isomorphism σ̂ : E → Ē induces an isomorphism τ : F(u) → K, where K = σ̂[F(u)] is a subfield of Ē containing F̄ (see the diagram). Obviously, σ̂ extends τ and τ extends σ. Hence, the number of possibilities for σ̂ equals the number of extensions τ of σ times the number of extensions σ̂ of τ. Now the multiplication theorem gives

[E : F(u)] = [E : F] / [F(u) : F] = [E : F] / k < [E : F].

[Diagram: the towers F ⊆ F(u) ⊆ E and F̄ ⊆ K ⊆ Ē, joined level by level by the maps σ, τ and σ̂.]

Moreover, E is the splitting field of f over F(u), and f remains separable over F(u) because any irreducible factor of f in F(u)[x] must divide an irreducible factor of f in F[x]. Hence, by induction, the number of extensions of τ to E is [E : F(u)] = [E : F]/k. So it remains to show that there are exactly k one-to-one ring homomorphisms τ : F(u) → Ē that extend σ.

But f ↦ f^σ is a ring isomorphism F[x] → F̄[x], so p^σ is irreducible of degree k in F̄[x]. Moreover, p′ ≠ 0 (f is separable by hypothesis) so (p^σ)′ = (p′)^σ ≠ 0. Thus, p^σ is separable and splits in Ē, so it has m = k distinct roots v_1, ..., v_m in Ē, and Theorem 3 §6.3 shows that, for each i, an isomorphism τ_i : F(u) → F̄(v_i) exists that extends σ and satisfies τ_i(u) = v_i. Hence, {τ_1, τ_2, ..., τ_m} are distinct extensions of σ to F(u). But if τ is any such extension, then p^σ[τ(u)] = τ[p(u)] = τ(0) = 0, so τ(u) = v_i = τ_i(u) for some i. Hence, τ = τ_i, which completes the proof. ∎


If we take F̄ = F, Ē = E, and σ = ε in Theorem 5, we obtain


Corollary. Let E ⊇ F be a splitting field of a separable polynomial in F[x]. If G = gal(E : F), then |G| = [E : F].


The corollary will be used several times below. In fact, extensions E ⊇ F, where E is the splitting field of a separable polynomial over F, will occupy much of our effort in Section 10.2. We conclude this section with the surprising fact that every finite separable extension is simple.


Theorem 6. Primitive Element Theorem. Let E ⊇ F be a finite separable extension. Then E is a simple extension of F; that is, E = F(u) for some u ∈ E.


Proof. If F is a finite field, then E is also finite, so the unit group E* is cyclic by Theorem 7 §6.4, say E* = ⟨u⟩. Hence, E = F(u).

So assume that F is infinite. By Theorem 6 §6.2 (and induction), we may assume that E = F(v, w). Let p and q be the minimal polynomials over F for v and w, and let v_1 = v, v_2, ..., v_m and w_1 = w, w_2, ..., w_n, respectively, be the roots of p and q in E. The v_i are distinct because p is separable (take a splitting field of p containing E). Similarly, the w_j are distinct. As F is infinite, a ∈ F exists such that

a ≠ (v_i − v)/(w − w_j) for all i and all j ≠ 1.

If u = v + aw, then F(u) ⊆ F(v, w) = E, and we claim this is equality. For this it suffices to show that w ∈ F(u). Write K = F(u) for convenience, and let m be the minimal polynomial of w over K. Then it suffices to show that w ∈ K or, equivalently, that m is linear. Now m | q (because q(w) = 0), so m is the product of some of the factors x − w_j. On the other hand, define f = p(u − ax) ∈ K[x]. Then f(w) = p(v) = 0, so m | f. However, f(w_j) ≠ 0 for all j ≠ 1 by the choice of a and u, so m(w_j) ≠ 0. Thus m = x − w, as required. ∎


Corollary. If F has characteristic 0, any finite extension of F is simple. 


The proof of the primitive element theorem actually gives an algorithm for 
finding a generator of the extension. Here is an example. 


Example 10. Let F = Q and E = Q(√2, √5). In the notation of the proof of Theorem 6, we write v = √2 and w = √5, so the minimal polynomials are p = x² − 2 and q = x² − 5. Then v_1 = √2, v_2 = −√2, w_1 = √5, and w_2 = −√5, so the quantities (v_i − v)/(w − w_j) in the proof reduce to 0 and −√2/√5. If we choose a = 1, the proof gives E = Q(√2 + √5), as we showed directly in Example 15 §6.2.
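The conclusion of Example 10 can be double-checked with exact integer arithmetic. The sketch below is an illustration, not part of the proof. It assumes that an element a + b√2 + c√5 + d√10 of E is stored as the 4-tuple (a, b, c, d); the helpers `mul` and `det` are hypothetical. It verifies that 1, u, u², u³ are linearly independent over Q for u = √2 + √5, so that [Q(u) : Q] = 4 = [E : Q], and that u satisfies x⁴ − 14x² + 9.

```python
def mul(x, y):
    """Product in Q(sqrt2, sqrt5) with respect to the basis 1, sqrt2, sqrt5, sqrt10."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (
        a1 * a2 + 2 * b1 * b2 + 5 * c1 * c2 + 10 * d1 * d2,
        a1 * b2 + b1 * a2 + 5 * (c1 * d2 + d1 * c2),   # sqrt5 * sqrt10 = 5 sqrt2
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),   # sqrt2 * sqrt10 = 2 sqrt5
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,         # sqrt2 * sqrt5 = sqrt10
    )

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

one = (1, 0, 0, 0)
u = (0, 1, 1, 0)                      # u = sqrt2 + sqrt5, the choice a = 1
u2 = mul(u, u)
u3 = mul(u2, u)
u4 = mul(u3, u)

# Coordinates of u^4 - 14 u^2 + 9; all zero means u is a root of x^4 - 14x^2 + 9.
relation = tuple(p - 14 * q + 9 * r for p, q, r in zip(u4, u2, one))
# A nonzero determinant means 1, u, u^2, u^3 are independent over Q.
independence = det([one, u, u2, u3])
print(u2, u3, relation, independence)
```

Because the determinant is nonzero, the powers 1, u, u², u³ span the 4-dimensional space E, so E = Q(u), in agreement with the example.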


Exercises 10.1 


Throughout these exercises, E and F are assumed to be fields.


1. Prove that gal(E : F) is a group for any field extension E ⊇ F.
2. Prove Lemma 4.
3. If E ⊇ F and {u_1, ..., u_n} is an F-basis of E, show that σ ∈ gal(E : F) is uniquely determined by the choice of σ(u_1), ..., σ(u_n).
4. If E ⊇ F and u ∈ E, show that gal(E : F(u)) = {σ ∈ gal(E : F) | σ(u) = u}.
5. If E ⊇ Q, show that gal(E : Q) = aut E.
6. If E = Q(e?™*/®), compute gal(E : Q). 
7. If E = Q(e?**/®), compute gal(E : Q). 
8. If E = Q(√2, √3), show that gal(E : Q) ≅ C_2 × C_2. [Hint: Lemma 2.]
9. If E = Q(i, √3), compute gal(E : Q).
10. (a) If E = Q(⁴√2), show that gal(E : Q) ≅ C_2. [Hint: Lemma 2.]
(b) Why does (a) not contradict the corollary to Theorem 5?
11. If [E : F] = 2, show that gal(E : F) ≅ C_2.
12. Let E be the splitting field of f = x⁶ + x³ − 1 over Z_3. Show that E = Z_3(u) is a simple extension and find gal(E : Z_3). [Hint: (a + b + c)³ = a³ + b³ + c³ in Z_3.]
13. If E = Q(⁴√2, i), show that gal(E : Q) ≅ D_4. [Hint: If u = ⁴√2, find σ and τ in gal(E : Q) such that σ(u) = iu, σ(i) = i, τ(u) = u, and τ(i) = −i.]
14. Let E be the splitting field over Q of x^n − 1. Show that gal(E : Q) is abelian.
15. Use the method of Example 10 to show that F = Q(u) if 


(a) E= Q(v3, v5) (b) E = Q(i, v5) 
16. (a) Show that Q(√p, √q) = Q(√p + √q), where p and q are distinct primes. [Hint: Example 10.]
(b) Show that Q(√p, √q, √r) = Q(√p + √q + √r), where p, q and r are distinct primes. [Hint: Exercise 32 §6.2.]


17. Show that gal(R : Q) = {ε}. [Hint: If u < v in R, show that σ(u) < σ(v) for all σ in gal(R : Q) because v − u = w², w ∈ R. If u < σ(u), choose a ∈ Q such that u < a < σ(u).]

18. Let u = e^{2πi/q}, where q is an odd prime. Show that gal(Q(u) : Q) ≅ C_{q−1}. [Hint: Show that Q(u²) = Q(u).]

19. Let E = F(u_1, u_2, ..., u_n), where each u_i is algebraic over F. If σ, τ ∈ gal(E : F) satisfy σ(u_i) = τ(u_i) for each i, show that σ = τ.

20. Let F = K(t) denote the field of rational forms over a field K in an indeterminate t. Show that x² − t is irreducible over F but is not separable if char K = 2.

21. Let F(t) denote the field of rational forms over a field F. Given the matrix M with rows (a, b) and (c, d) in GL_2(F), define σ_M : F(t) → F(t) by σ_M[λ(t)] = λ((at + b)/(ct + d)). Show that M ↦ σ_M is an onto group homomorphism GL_2(F) → gal[F(t) : F] with kernel Z[GL_2(F)] = {aI | 0 ≠ a ∈ F}, the set of nonzero scalar matrices.

22. (a) Show that the following are equivalent for a polynomial f in F[x].
(1) f has no repeated root in any extension field of F.
(2) f has no repeated root in some splitting field over F.
(3) f and f′ are relatively prime in F[x].
(b) If f is as in (a), show that f is separable, but not conversely.

23. If n ≥ 2, show that f = x^n − x ∈ F[x] has no repeated root in any splitting field if either char F = 0 or char F = p and p does not divide n − 1. [Hint: Exercise 22.]

24. If char F = p and F contains n distinct nth roots of unity, show that p does not divide n. [Hint: Exercise 22.]

25. If E ⊇ F and f ∈ F[x] is separable over F, show that f is separable over E.

26. If E ⊇ K ⊇ F and E ⊇ F is a separable extension, show that both E ⊇ K and K ⊇ F are separable extensions. [Remark: The converse is true if [E : F] is finite; see Exercise 31.]

27. Let F have characteristic p. If f = x^p − a, where a ∈ F, show that f is irreducible or a power of a linear polynomial. [Hint: Lemma 5 and Theorem 4.]

28. (a) Show that the following are equivalent for F (then called a perfect field):
(1) Every algebraic extension of F is separable.
(2) Every finite extension of F is separable.
(3) Every irreducible polynomial in F[x] is separable.
(b) Show that every field of characteristic 0 is perfect.
(c) Show that every algebraic extension of a perfect field is perfect.

29. (a) Let F be a field of characteristic p. Show that F is perfect (Exercise 28) if and only if every element b ∈ F has the form b = a^p for some a ∈ F. [Hint: If F is perfect and a ∈ F, consider the irreducible factors of x^p − a in some splitting field. For the converse, use Theorem 4.]
(b) Show that every finite field is perfect.

30. Let E ⊇ F be a finite extension, where char F = p.
(a) If u ∈ E has a separable minimal polynomial q over F, show that u ∈ F(u^p). [Hint: If m is the minimal polynomial of u over F(u^p), show m | q and m | (x − u)^p.]
(b) Define F(E^p) = {a_1 u_1^p + ··· + a_n u_n^p | a_i ∈ F, u_i ∈ E, n ≥ 1}. Show that F(E^p) is a subfield of E. [Hint: Exercise 35 §6.2.]
(c) If E = F(E^p) and {w_1, ..., w_k} ⊆ E is F-independent, show that {w_1^p, ..., w_k^p} is F-independent. [Hint: Extend to a basis {w_1, ..., w_k, ..., w_n} of E, show that {w_1^p, ..., w_k^p, ..., w_n^p} spans E, and apply Theorem 7 §6.1.]
(d) Show that E ⊇ F is separable if and only if F(E^p) = E. [Hint: If E = F(E^p), use Theorem 4 §6.2 and (c).]

31. Let E ⊇ K ⊇ F be fields with [E : F] finite. Show that E ⊇ F is separable if and only if both E ⊇ K and K ⊇ F are separable. [Hint: Exercises 26 and 30.]

32. If E ⊇ F is a finite extension, then u ∈ E is called a separable element over F if its minimal polynomial in F[x] is separable.
(a) If u ∈ E is separable over F and E ⊇ K ⊇ F, where K is a field, show that u is separable over K. [Hint: Exercise 30(d).]
(b) Show that u ∈ E is separable over F if and only if F(u) ⊇ F is a separable extension.
(c) Define S = {u ∈ E | u is separable over F}. Show that S is a subfield of E, that S ⊇ F is separable, and that E ⊇ K ⊇ F, with K ⊇ F separable, implies that S ⊇ K. The field S is called the separable closure of F in E. [Hint: If u, v ∈ S, show that F(u, v) ⊇ F is separable by (a) and Exercise 31.]


10.2 THE MAIN THEOREM OF GALOIS THEORY 


The central theme of Galois theory is to analyze a field extension E ⊇ F by studying its Galois group G = gal(E : F). It turns out that a beautiful correspondence exists between the subgroups H of G and the intermediate fields K with E ⊇ K ⊇ F. This correspondence was first noticed by Galois in his study of the roots of polynomials, published in 1846, but it was not until 1894 that Richard Dedekind first formulated the theory in terms of field extensions. We begin with two of Dedekind's theorems on field automorphisms in the form given in 1942 by Emil Artin in his definitive account of the subject.¹¹⁴ The first of these results is more general than needed here, but the additional generality involves little extra effort, improves the exposition, and introduces the concept of a group character, which is important in the theory of group representations.

Let G be a group and let E be a field. A group homomorphism σ : G → E* is called a character of G in E. A set {σ_1, ..., σ_n} of characters of G in E is called independent¹¹⁵ if, given u_1, ..., u_n in E,

u_1 σ_1(g) + u_2 σ_2(g) + ··· + u_n σ_n(g) = 0 for all g ∈ G

implies that u_1 = u_2 = ··· = u_n = 0.


Lemma 1. Dedekind's Lemma. Let {σ_1, ..., σ_n} be a finite set of distinct characters of a group G in a field E. Then {σ_1, ..., σ_n} is independent.


114 Artin, E., Galois Theory, 2nd ed., Notre Dame Mathematical Lectures No. 2, University of 
Notre Dame, 1964. 

115 This property is linear independence in the vector space V of all mappings σ : G → E, where addition and scalar multiplication are defined by (σ + τ)(g) = σ(g) + τ(g) and (uσ)(g) = uσ(g) for all g ∈ G, all σ, τ ∈ V, and all u ∈ E.




Proof. For simplicity, write σ_i(g) = σ_i g for each i and all g ∈ G. Proceed by induction on the number n of distinct characters. If n = 1, then u_1 σ_1 g = 0 for all g ∈ G implies that u_1 = 0 because σ_1 g ≠ 0. If n > 1, assume that

u_1 σ_1 g + u_2 σ_2 g + ··· + u_n σ_n g = 0 for all g ∈ G. (*)

We must show that u_i = 0 for all i. If not, we may assume (by induction) that u_i ≠ 0 for all i. Given h ∈ G, replace g by gh in (*) and use the fact that the σ_i are homomorphisms to get

u_1 σ_1 g σ_1 h + u_2 σ_2 g σ_2 h + ··· + u_n σ_n g σ_n h = 0 for all g ∈ G. (**)

If (*) is multiplied by σ_1 h and the result is subtracted from (**), the first terms cancel and the result is

u_2(σ_2 h − σ_1 h) σ_2 g + ··· + u_n(σ_n h − σ_1 h) σ_n g = 0 for all g ∈ G.

Thus, u_i(σ_i h − σ_1 h) = 0 for all i ≥ 2 by induction, which yields σ_i h = σ_1 h because u_i ≠ 0. Because this is true for all h ∈ G, it implies that σ_i = σ_1 for each i, contrary to the hypothesis that they are distinct. ∎


For Galois theory, the most interesting use of Lemma 1 arises as follows: If σ : E → E is an automorphism of the field E, the restriction of σ to the group E* of units of E is a group homomorphism E* → E* and so is a character of E* in E. This gives the following corollary.


Corollary. Any finite set of automorphisms of a field E is independent.


If E ⊇ F is the splitting field of a separable polynomial in F[x], then the corollary to Theorem 5 §10.1 gives |gal(E : F)| = [E : F]. Dedekind's lemma gives us half of this for any finite extension.


Theorem 1. Let E ⊇ F be a finite extension of fields. If G = gal(E : F) denotes the Galois group, then
|G| ≤ [E : F].


Proof. Write [E : F] = n and let {v_1, ..., v_n} be an F-basis of E; we must show that |G| ≤ n. If |G| > n, let σ_0, σ_1, ..., σ_n be distinct elements of G and write σ_j(v) = σ_j v for v ∈ E as before. Since each σ_j fixes F, it is F-linear in the sense that σ_j(av) = aσ_j(v) for all a ∈ F and v ∈ E.

Now consider the following set of n equations in n + 1 variables x_0, x_1, ..., x_n:

σ_0 v_1 x_0 + σ_1 v_1 x_1 + ··· + σ_n v_1 x_n = 0,
σ_0 v_2 x_0 + σ_1 v_2 x_1 + ··· + σ_n v_2 x_n = 0,
...
σ_0 v_n x_0 + σ_1 v_n x_1 + ··· + σ_n v_n x_n = 0.

Because there are more variables than equations, a solution x_j = u_j ∈ E exists where u_j ≠ 0 for some j. Thus,

∑_{j=0}^{n} σ_j v_i u_j = 0 for i = 1, 2, ..., n.

Given u ∈ E, write u = ∑_{i=1}^{n} a_i v_i, a_i ∈ F. Since each σ_j is F-linear, we get

∑_{j=0}^{n} u_j σ_j u = ∑_{j=0}^{n} u_j (∑_{i=1}^{n} a_i σ_j v_i) = ∑_{i=1}^{n} a_i (∑_{j=0}^{n} u_j σ_j v_i) = ∑_{i=1}^{n} a_i · 0 = 0.

This is a contradiction because the σ_j are independent by the corollary to Dedekind's lemma. ∎


No algebraist could resist the temptation to discover when equality holds in Theorem 1. To study this question, we need a concept that reflects Artin's point of view that the basic object of study in Galois theory is a field E, together with a group G of automorphisms of E. In this case, write

E_G = {u ∈ E | σ(u) = u for all σ ∈ G} = {u ∈ E | G fixes u}.

It is easy to verify that E_G is a subfield of E, called the fixed field of G in E. Note that G ⊆ gal(E : E_G). If G is finite, we have the following fundamental result that, although stated originally by Dedekind, has become known as the Dedekind-Artin theorem.


Theorem 2. Dedekind-Artin Theorem. Let E be a field and let G be a finite group of automorphisms of E. Then [E : E_G] is finite and
[E : E_G] = |G|.


Proof. Write E_G = F and |G| = n. If [E : F] is finite, then n ≤ [E : F] by Theorem 1 because G ⊆ gal(E : F). Hence, the proof is completed by showing that n < [E : F] leads to a contradiction. In this case, let {u_0, u_1, ..., u_n} ⊆ E be independent over F. Consider the following set of |G| = n equations in n + 1 variables x_0, x_1, ..., x_n where, once again, we write σ(u_i) = σu_i whenever σ ∈ G.

σu_0 x_0 + σu_1 x_1 + ··· + σu_n x_n = 0, σ ∈ G. (*)

Because there are more variables than equations, there is a solution with not all variables zero. Among all such solutions, choose one with the smallest number r + 1 of nonzero values. By relabeling variables if necessary, we may assume that x_0 = v_0, ..., x_r = v_r are these nonzero values and (multiplying by v_0⁻¹) assume further that v_0 = 1. Then (*) becomes

σu_0 + σu_1 v_1 + ··· + σu_r v_r = 0, σ ∈ G. (**)

Taking σ = ε gives u_0 + u_1 v_1 + ··· + u_r v_r = 0, so, as the u_i are F-independent, v_k ∉ F for some k ≤ r. By the definition of F = E_G, τv_k ≠ v_k for some τ ∈ G. Apply τ to equations (**) to get

τσu_0 + τσu_1 τv_1 + ··· + τσu_r τv_r = 0, σ ∈ G.

Because τσ runs through the entire group G as σ does, these equations, written in a different order, take the form

σu_0 + σu_1 τv_1 + ··· + σu_r τv_r = 0, σ ∈ G. (***)

Now subtract (***) from (**) to get

σu_1(v_1 − τv_1) + ··· + σu_r(v_r − τv_r) = 0, σ ∈ G.

As v_k − τv_k ≠ 0, this gives a nontrivial solution to (*) with at most r nonzero values, contradicting the choice of r. ∎


Example 1. Let u = e^{2πi/5}, E = Q(u), and F = Q. If G = gal(E : F), we showed in Example 4 §10.1 that G = ⟨σ⟩ ≅ C_4, where σ is defined by σ(u) = u². Thus, [E : E_G] = 4 by the Dedekind-Artin theorem. However, the minimal polynomial of u is 1 + x + x² + x³ + x⁴, so [E : Q] = 4. As Q ⊆ E_G ⊆ E, this implies that E_G = Q; that is, the only elements of E fixed by G are the elements of Q.

Now consider H = ⟨σ²⟩ and compute E_H = {w ∈ E | σ²(w) = w}. Note that {1, u, u², u³} is a Q-basis of E and that σ²(u^k) = u^{4k} for each k (because σ²(u) = u⁴). If w = a + bu + cu² + du³ is in E_H, this gives

w = σ²(w) = a + bu⁴ + cu⁸ + du¹² = a + b(−1 − u − u² − u³) + cu³ + du².

Then equating coefficients of the powers of u implies that b = 0 and d = c, so w = a + c(u² + u³). Thus, [E_H : Q] = 2, and so [E : E_H] = [E : Q]/[E_H : Q] = 2 = |H|, as the Dedekind-Artin theorem asserts. □
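The computation in Example 1 can be replayed mechanically. The sketch below is a minimal illustration under stated assumptions: an element a + bu + cu² + du³ of E is stored as the list [a, b, c, d], and the hypothetical helper `apply_aut(w, k)` applies the automorphism determined by u ↦ u^k (so k = 2 is σ and k = 4 is σ²), reducing with u⁵ = 1 and u⁴ = −1 − u − u² − u³.

```python
def apply_aut(w, k):
    """Image of w = w[0] + w[1]u + w[2]u^2 + w[3]u^3 under u -> u^k, for k in 1..4."""
    powers = [0] * 5                       # coefficients of 1, u, u^2, u^3, u^4
    for i, c in enumerate(w):
        powers[(i * k) % 5] += c           # u^(ik) depends only on ik mod 5
    # Eliminate u^4 using u^4 = -1 - u - u^2 - u^3.
    return [powers[i] - powers[4] for i in range(4)]

w = [0, 0, 1, 1]                           # w = u^2 + u^3
fixed_by_sigma_squared = apply_aut(w, 4) == w
image_under_sigma = apply_aut(w, 2)        # sigma(w) = u^4 + u

# Applying sigma twice agrees with sigma^2 on a sample element.
sample = [1, 2, 3, 4]
twice = apply_aut(apply_aut(sample, 2), 2)
print(fixed_by_sigma_squared, image_under_sigma, twice == apply_aut(sample, 4))
```

The element u² + u³ is fixed by σ² but moved by σ, so it lies in E_H and not in E_G = Q, which matches the conclusion [E_H : Q] = 2.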


Galois Extensions 


Fix a particular field extension E ⊇ F and write G = gal(E : F). A field K such that E ⊇ K ⊇ F is called an intermediate field of the extension. The heart of Galois theory is the observation that these intermediate fields are intimately related to the subgroups of the Galois group G. Indeed, if K is an intermediate field, gal(E : K) is a subgroup of G denoted, for convenience, by

K′ = gal(E : K) = {σ ∈ G | σ(u) = u for all u ∈ K} = {σ ∈ G | σ fixes K}.

Conversely, for a subgroup H of G, the fixed field E_H of H in E is easily verified to be an intermediate field of the extension and is denoted for our present purposes as

H° = E_H = {u ∈ E | σ(u) = u for all σ ∈ H} = {u ∈ E | u is fixed by H}.

The basic properties of these constructions are collected in Lemma 2.
Lemma 2. Let E ⊇ F be fields and write G = gal(E : F). Let K and K_1 be intermediate fields and let H and H_1 be subgroups of G. Then

(1) If K ⊆ K_1, then K′ ⊇ K_1′.

(2) If H ⊆ H_1, then H° ⊇ H_1°.

(3) K ⊆ K′° and K′° = {u ∈ E | if σ ∈ G fixes K, then σ fixes u}.

(4) H ⊆ H°′ and H°′ = {σ ∈ G | if u ∈ E is fixed by H, then u is fixed by σ}.

(5) K′ = K′°′.

(6) H° = H°′°.

Proof. (1) and (2) are immediate consequences of the definition, as are the descriptions of K′° and H°′ in (3) and (4). These descriptions imply that K ⊆ K′° and H ⊆ H°′, proving (3) and (4). Now K′ ⊇ K′°′ and H° ⊇ H°′° by (1) and (2). But

K′°′ = (K′)°′ ⊇ K′ by (4) and H°′° = (H°)′° ⊇ H° by (3).

This proves (5) and (6). ∎




By virtue of these properties, the maps K ↦ K′ and H ↦ H° are called a Galois connection. The most interesting case is when these maps are mutually inverse bijections, and we characterize the extensions for which this happens in a moment. However, we first need to say something about the most general case.

Following Irving Kaplansky,¹¹⁶ it is convenient to call H°′ and K′° the closures of H and K, respectively, and to call H and K closed if H = H°′ and K = K′°, respectively. Thus, (5) and (6) of Lemma 2 assert that K′ and H° are always closed, which leads to Lemma 3.


Lemma 3. Let E ⊇ F be fields and let G = gal(E : F). Then
K ↦ K′ and H ↦ H°
are mutually inverse, order-reversing bijections between the set of closed intermediate fields K of the extension E ⊇ F and the set of closed subgroups H of the Galois group G.


Proof. These maps are defined because K′ and H° are closed, they are order reversing by (1) and (2) of Lemma 2, and they are mutually inverse bijections because K′° = K and H°′ = H whenever K and H are closed. ∎


This result is slick, but it is not very useful unless we have a good idea about which intermediate fields and which subgroups are closed. To motivate the discussion, view the effect of the ′ and ° operations as shown in the diagram, where E ⊇ F are fields and G = gal(E : F).

[Diagram: the column E over F beside the column {ε} over G, drawn twice; the operation ′ carries the field column to the group column, and the operation ° carries the group column back to the field column.]

Applying the operations at the tops and bottoms of these diagrams, one sees

E′ = {ε} and {ε}° = E,
F′ = G and G° ⊇ F.

The anomaly G° ⊇ F begs for attention. It need not be equality: If F = Q and E = Q(∛2), then G = {ε} by Example 3 §10.1, so G° = E ⊃ F. However, we do have some useful conditions when equality happens.

Lemma 4. If E ⊇ F are fields and G = gal(E : F), the following are equivalent.
(1) G° = F.
(2) F is closed.
(3) The only elements of E fixed by each σ ∈ G are the elements of F.

Proof. As F′ = G, we have F′° = G°, so (1) ⇔ (2). Finally, (1) ⇔ (3) follows because G° = E_G = {u ∈ E | σ(u) = u for all σ ∈ G}. ∎

116 Kaplansky, I., Fields and Rings, 2nd ed., Chicago: University of Chicago Press, 1972.


A field extension E ⊇ F is called a Galois extension if the conditions in Lemma 4 are satisfied. Hence, E ⊇ F is Galois if and only if

Given u ∈ E with u ∉ F, there exists σ ∈ gal(E : F) such that σ(u) ≠ u.


Example 2. C ⊇ R is Galois because gal(C : R) = {ε, γ}, where γ is conjugation (Example 2 §10.1), and the only complex numbers fixed under conjugation are real.


Example 3. If E = Q(∛2), then E ⊇ Q is a finite extension that is separable (as char Q = 0) but is not Galois. Indeed, G = gal(E : Q) = {ε} by Example 3 §10.1, so every element of E is fixed by G.


Galois extensions have been defined very abstractly. Hence, Theorem 3 is fundamental because it characterizes them in terms of splitting fields and separability.


Theorem 3. The following conditions are equivalent for a finite field extension E ⊇ F with Galois group G = gal(E : F).
(1) E ⊇ F is a Galois extension.
(2) Each irreducible polynomial in F[x] with a root in E is separable and splits in E[x].
(3) E is the splitting field of some separable polynomial in F[x].
In addition, E ⊇ F is a separable extension.


Proof. (1) ⇒ (2). Let p be irreducible in F[x] and let p(u) = 0, u ∈ E. Write X = {τ(u) | τ ∈ G}. Then X is a finite subset of E because G is a finite group by Theorem 2 §10.1 (E ⊇ F is finite by hypothesis). So let u = u_1, u_2, ..., u_m denote the distinct elements of X and define f in E[x] by

f = (x − u_1)(x − u_2)···(x − u_m).

If σ ∈ G, then σu_1, σu_2, ..., σu_m are distinct and they are elements of X because G is a group. Hence, they are the elements u_1, u_2, ..., u_m in a different order. Since f ↦ f^σ is a ring isomorphism E[x] → E[x],¹¹⁷ this means that

f^σ = ∏_{i=1}^{m} (x − u_i)^σ = ∏_{i=1}^{m} (x − σu_i) = ∏_{i=1}^{m} (x − u_i) = f.

It follows that σ fixes each coefficient of f. Hence, these coefficients lie in F by (1), and so f ∈ F[x]. But p is the minimal polynomial of u over F (being irreducible), so f(u) = 0 implies that p divides f in F[x]. Hence, p splits in E and is separable (because the u_i are distinct). This proves (2).

(2) ⇒ (3). If E = F, there is nothing to prove. Otherwise, choose u_1 ∈ E, u_1 ∉ F, and let p_1 be its minimal polynomial over F. Then p_1 is separable and splits over E by (2), so let E_1 ⊆ E be a splitting field of p_1. If E_1 = E, we are done; otherwise, choose u_2 ∈ E, u_2 ∉ E_1, with minimal polynomial p_2 over F. Again, p_2 is separable and splits over E by (2). Write f_2 = p_1 p_2, so that f_2 is separable and splits over E. Let E_2 ⊆ E be a splitting field of f_2. If E_2 = E, we are done. This process must stop because F ⊂ E_1 ⊂ E_2 ⊂ ··· ⊆ E and [E : F] is finite by hypothesis.

(3) ⇒ (1). Consider E ⊇ E_G ⊇ F. Given (3), the corollary of Theorem 5 §10.1 shows that [E : F] = |G|. But [E : E_G] = |G| by the Dedekind-Artin theorem, so E_G = F and (1) follows because G° = E_G by definition.

Finally, E ⊇ F is separable by the proof of (1) ⇒ (2). ∎

117 If f = a_0 + a_1 x + ··· + a_n x^n, then f^σ = σ(a_0) + σ(a_1)x + ··· + σ(a_n)x^n.


Every algebraic extension of a field of characteristic 0 is separable, so we have


Corollary 1. If char F = 0, the finite Galois extensions of F are precisely the splitting fields of polynomials in F[x].


Every finite Galois extension¹¹⁸ is separable by Theorem 3 (but see Example 3). However, the primitive element theorem (Theorem 6 §10.1) gives


Corollary 2. Every finite Galois extension E ⊇ F is simple; that is, E = F(u) for some u ∈ E.


The next corollary proves again the corollary to Theorem 5 §10.1.


Corollary 3. If E ⊇ F is a finite Galois extension and G = gal(E : F), then [E : F] = |G|.

Proof. The proof of (3) ⇒ (1) in Theorem 3 shows that E_G = F, so the Dedekind-Artin theorem applies. ∎


Corollary 4. If E ⊇ K ⊇ F are fields and E ⊇ F is a finite Galois extension, then E ⊇ K is also a Galois extension.


Proof. As E ⊇ F is Galois, let E be the splitting field over F of the separable polynomial f in F[x]. Then f ∈ K[x], E is a splitting field of f over K, and f is separable over K. Hence, E ⊇ K is Galois by Theorem 3. ∎


If K ⊇ F is a finite Galois extension, the easiest way to obtain elements in gal(K : F) is often as the restriction to K of automorphisms in gal(E : F) for some field E ⊇ K. Hence, Corollary 5 is useful because it places no condition on the extension E ⊇ F.


Corollary 5. Let E ⊇ K ⊇ F be fields, where K ⊇ F is finite and Galois. If σ ∈ gal(E : F), then σ(K) = K, so σ restricts to an automorphism in gal(K : F).

Proof. If u ∈ K, let p be its minimal polynomial over F. Given σ ∈ gal(E : F), we have p[σ(u)] = σ[p(u)] = σ(0) = 0, so σ(u) ∈ E is also a root of p. But p splits in K by Theorem 3, so σ(u) ∈ K. This proves that σ(K) ⊆ K. Similarly, σ⁻¹(K) ⊆ K, so σ(K) = K, as asserted. ∎


Example 4. If C ⊇ K ⊇ Q, where K is the splitting field of a polynomial in Q[x], then K ⊇ Q is Galois (Corollary 1), so complex conjugation restricts to an automorphism in gal(K : Q) by Corollary 5.


The Main Theorem 


Until now, all our results have been valid for arbitrary subgroups of the Galois group. However, the normal subgroups play a special role, and the property in Corollary 5 is the analogue for intermediate fields of normality for subgroups. Again, our terminology follows Kaplansky. If E ⊇ K ⊇ F are fields, the intermediate field K is called stable in the extension E ⊇ F if

σ(K) ⊆ K for all σ ∈ gal(E : F), equivalently
σ(K) = K for all σ ∈ gal(E : F).

Clearly, both E and F are stable in E ⊇ F. And if K ⊇ F is finite and Galois, then the intermediate field K is stable in E ⊇ F (Corollary 5).

118 Sometimes called a normal extension. However, the term "normal" is used in other ways.


Lemma 5. Let E ⊇ F be fields and let G = gal(E : F). Then:
(1) If H is a normal subgroup of G, then H° = E_H is stable in E ⊇ F.

(2) If K is a stable intermediate field, then K′ = gal(E : K) is normal in G and G/K′ ≅ {λ ∈ gal(K : F) | λ extends to an automorphism of E}.


Proof. (1) Given H ◁ G, let σ ∈ G. We must show that σ(H°) ⊆ H°, that is, σ(u) ∈ H° for all u ∈ H°. But if τ ∈ H, then σ⁻¹τσ ∈ H, so (σ⁻¹τσ)(u) = u. Thus, τ[σ(u)] = σ(u) for all τ ∈ H; that is, σ(u) ∈ H°.

(2) If K is stable and σ ∈ G, then σ(K) = K, so the restriction σ|_K : K → K of σ to K is in gal(K : F). Hence, define φ : G → gal(K : F) by φ(σ) = σ|_K for all σ ∈ G. This is a group homomorphism and

ker φ = {σ ∈ G | σ(u) = u for all u ∈ K} = K′.

Hence K′ is normal in G. Finally, φ(G) = {λ ∈ gal(K : F) | λ extends to an automorphism of E}, so G/K′ ≅ φ(G) by the isomorphism theorem. ∎


Finally, we are ready to prove the most important theorem of this chapter. 
Recall that |G : H| denotes the index of H in G where H C G are finite groups. 


Theorem 4. The Main Theorem of Galois Theory. Let E ⊇ F be a finite
Galois extension with Galois group G = gal(E : F), let K and K₁ denote intermediate
fields of the extension E ⊇ F, and let H and H₁ denote subgroups of G. As
before, write K′ = gal(E : K) and H° = E_H.
(1) All H and K are closed. The maps K ↦ K′ and H ↦ H° are mutually
inverse, order-reversing bijections between the set of all intermediate fields
K of the extension E ⊇ F and the set of subgroups H of G.
(2) If K₁ ⊆ K, then [K : K₁] = |K₁′ : K′|.
(3) If H₁ ⊆ H, then |H : H₁| = [H₁° : H°].
(4) E ⊇ K is a Galois extension.
(5) K ⊇ F is Galois ⇔ K is stable in E ⊇ F ⇔ K′ ◁ G.
In this case, G/K′ ≅ gal(K : F).


Proof. Observe first that (4) is Corollary 4 of Theorem 3.

(1) By Lemma 3, it suffices to show that all K and H are closed. We have
H ⊆ H°′ by Lemma 2; to prove equality, note that |H| = [E : H°] by the
Dedekind-Artin theorem. Replacing H by H°′, we get |H°′| = [E : H°′°] = [E : H°] = |H|.
Hence, H = H°′ and H is closed. Turning to K, write H₁ = gal(E : K). Then
E ⊇ K is Galois by (4). Hence, K = {u ∈ E | σ(u) = u for all σ ∈ H₁} = H₁° and
this is always closed by Lemma 2.




(2) By (4), E ⊇ K is Galois, so [E : K] = |gal(E : K)| = |K′| by Corollary 3 of
Theorem 3. Similarly, [E : K₁] = |K₁′|. Hence,

[K : K₁] = [E : K₁] / [E : K] = |K₁′| / |K′| = |K₁′ : K′|.

(3) Write H° = K and H₁° = K₁. Then K ⊆ K₁ and (1) gives K′ = H°′ = H
and K₁′ = H₁°′ = H₁. Hence, (2) implies that


|H : H₁| = |K′ : K₁′| = [K₁ : K] = [H₁° : H°].


(5) Since K is closed by (1), we have K′° = K. Then Lemma 5 shows that K
is stable if and only if K′ ◁ G, and also gives G/K′ ≅ gal(K : F). If K ⊇ F is
Galois, then K is stable by Corollary 5 of Theorem 3. Conversely, if K is stable, let
u ∈ K \ F. As E ⊇ F is Galois, σ in G exists such that σ(u) ≠ u. But the restriction
of σ to K is in gal(K : F) because K is stable, and hence K ⊇ F is Galois. □




The main theorem has many uses because many properties of intermediate fields 
can be deduced from the analogous properties of the subgroups of the Galois group. 
To illustrate, we reprove an important property of finite fields (Theorem 5 §6.4). 


Corollary. If E = GF(p^n), where p is a prime, then E ⊇ ℤ_p is Galois and the
subfields of E are precisely the fields GF(p^m), where m | n.


Proof. E is the splitting field of f = x^(p^n) − x over ℤ_p (Theorem 4 §6.4) and f′ = −1,
so f has no repeated roots in E (Theorem 3 §6.4). Hence, E ⊇ ℤ_p is Galois. Next,
G = gal(E : ℤ_p) ≅ C_n by Example 6 §10.1. Thus, G has exactly one subgroup of
order m for each divisor m of |G| = [E : ℤ_p] = n, so the main theorem gives exactly
one intermediate field K with [E : K] = m. It must be GF(p^(n/m)) because
[K : ℤ_p] = n/m, and n/m runs through all divisors of n as m does. □
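
As a small illustration of the corollary (not part of the proof), the following Python sketch lists the subfields of GF(p^n) by listing divisors of n. The function names `divisors` and `subfield_lattice` are hypothetical, chosen only for this example, and n = 12 is an arbitrary choice.

```python
def divisors(n):
    """Return the positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def subfield_lattice(n):
    """Map each divisor m of n to the list of divisors of m.

    By the corollary, GF(p^m) is a subfield of GF(p^n) exactly when m | n,
    so the key m stands for GF(p^m) and the value lists its own subfields.
    """
    return {m: divisors(m) for m in divisors(n)}


# Print the containments among the subfields of GF(p^12).
for m, below in subfield_lattice(12).items():
    inside = ", ".join(f"GF(p^{d})" for d in below)
    print(f"GF(p^{m}) contains: {inside}")
```

The prime p plays no role in the shape of the lattice; only the divisibility relation on the exponents matters.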


The main theorem shows that if E ⊇ F is a finite Galois extension, the lattice
of intermediate fields has the same form as the (inverted) lattice of subgroups of
G = gal(E : F). Moreover, if H is a subgroup, then H = K′, where K = H°. Hence,
(5) of the main theorem translates to


H ◁ G if and only if H° ⊇ F is Galois.


Examples 5 and 6 illustrate this. 


Example 5. Let E = ℚ(u, v), where u = √2 and v = √3, and let G = gal(E : ℚ).
Show that G = ⟨σ, τ⟩ ≅ C₂ × C₂, where σ and τ are defined by σ(u) = u, σ(v) = −v,
and τ(u) = −u, τ(v) = v. Hence, find all the intermediate fields in E ⊇ ℚ.


Solution. The minimal polynomial of v is x² − 3, and the other root is −v. Hence,
there is a ℚ-automorphism σ₀ : ℚ(v) → ℚ(−v) satisfying σ₀(v) = −v. This mapping
extends to an automorphism σ of E = ℚ(v)(u) = ℚ(−v)(u) that satisfies σ(u) = u.
This creates σ, and we construct τ in the same way. Now |G| ≤ 4 because λ(u) = ±u
and λ(v) = ±v for all λ ∈ G. Because o(σ) = 2 = o(τ) and στ = τσ, it follows that
G = ⟨σ, τ⟩ ≅ C₂ × C₂.

The subgroups of G are {ε}, G, H₁ = ⟨σ⟩, H₂ = ⟨τ⟩, and H₃ = ⟨στ⟩. The
subgroup lattice (inverted) is shown in the right-hand diagram. Hence, the lattice
of intermediate fields is as shown in the left-hand diagram.




[Figure: lattice of intermediate fields of E ⊇ ℚ (left) and inverted subgroup lattice of G (right).]


Now E ⊇ ℚ is Galois because E is the splitting field of (x² − 2)(x² − 3), so the main
theorem ensures that all intermediate fields can be obtained in this way. Also,
H_i ◁ G for each i guarantees that H_i° ⊇ ℚ is Galois.

Finally, the main theorem is useful in the actual computation of the intermediate
fields. Clearly, u ∈ H₁°, so ℚ(u) ⊆ H₁°. But [ℚ(u) : ℚ] = 2 because x² − 2 is the
minimal polynomial of u, and [H₁° : ℚ] = [H₁° : G°] = |G : H₁| = 2 by the
main theorem. Hence, H₁° = ℚ(u), and similar arguments give H₂° = ℚ(v) and
H₃° = ℚ(uv) because στ(uv) = uv. □


Example 6. Let E be the splitting field of x³ − 2 over ℚ. Thus, E ⊇ ℚ is a Galois
extension by Theorem 3 because char ℚ = 0. In Example 7 §10.1, we showed
that G = gal(E : ℚ) ≅ D₃. In fact, write u = ∛2 and w = e^(2πi/3). Then G = ⟨σ, τ⟩,
where o(σ) = 3, o(τ) = 2, and στσ = τ, and σ and τ are defined by σ(u) = uw and
σ(w) = w, whereas τ(u) = u and τ(w) = w². Thus, the subgroups of G are {ε}, G,
H = ⟨σ⟩, H₁ = ⟨τ⟩, H₂ = ⟨τσ⟩, and H₃ = ⟨τσ²⟩. The (inverted) subgroup lattice
is shown in the right diagram along with the index of each group extension. The
left diagram gives the lattice of intermediate fields of the extension E ⊇ ℚ along
with all the dimensions. The main theorem guarantees that the dimensions and
indices correspond, as indicated in the figure. Moreover, the fact that H is normal in
G but H₁, H₂, and H₃ are not shows that H° ⊇ ℚ is a Galois extension, whereas
H₁° ⊇ ℚ, H₂° ⊇ ℚ, and H₃° ⊇ ℚ are not.


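[Figure: lattice of intermediate fields of E ⊇ ℚ with dimensions (left) and inverted subgroup lattice of G with indices (right).]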


Again the main theorem is useful in the computation of the intermediate fields.
We have σ(w) = w so, as H = ⟨σ⟩, ℚ(w) ⊆ H°. But [ℚ(w) : ℚ] = 2 because the
minimal polynomial of w is x² + x + 1, and [H° : ℚ] = [H° : G°] = |G : H| = 2 by
the main theorem. Hence, H° = ℚ(w). Similarly, τ(u) = u implies that H₁° = ℚ(u).
To find primitive elements for H₂° and H₃°, we must find elements of E \ ℚ fixed
by τσ and τσ², respectively. Now each of these automorphisms must permute the
set {u, uw, uw²} of roots of x³ − 2 and because these maps have order 2 in G,
each must fix one of these roots. A routine check reveals that τσ(uw) = uw and
τσ²(uw²) = uw², so H₂° = ℚ(uw) and H₃° = ℚ(uw²). □
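
The subgroup side of Example 6 can be checked mechanically. The sketch below is an illustration only: it models G ≅ D₃ as the full permutation group of the three roots (labelled 0, 1, 2), enumerates the subgroups, and tests each for normality. The helper names `compose`, `inverse`, `generated`, and `is_normal` are hypothetical, introduced for this example.

```python
from itertools import combinations, permutations


def compose(a, b):
    """Return the permutation a∘b (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(len(b)))


def inverse(a):
    """Return the inverse permutation of a."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)


def generated(gens, n):
    """Return the subgroup of S_n generated by gens (closure under products)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def is_normal(H, G):
    """Check that g h g^{-1} lies in H for all g in G and h in H."""
    return all(compose(compose(g, h), inverse(g)) in H for g in G for h in H)


G = frozenset(permutations(range(3)))
# Every subgroup of this group is generated by at most two elements.
subgroups = {generated(pair, 3) for pair in combinations(G, 2)}
subgroups |= {generated([g], 3) for g in G}

for H in sorted(subgroups, key=len):
    print(f"|H| = {len(H)}, |G : H| = {len(G) // len(H)}, normal: {is_normal(H, G)}")

normal_orders = sorted(len(H) for H in subgroups if is_normal(H, G))
```

Only the subgroups of orders 1, 3, and 6 are reported as normal, which matches the statement that H° ⊇ ℚ is Galois while the fixed fields of the three subgroups of order 2 are not.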


The main theorem is not the end of the Galois theory, but rather the beginning.
The study of abelian Galois groups leads to class field theory, an active
research area with applications to algebraic number theory. If fields are replaced
by commutative rings¹¹⁹ or by division rings,¹²⁰ much of the theory still applies,
suitably modified, and a version of the main theorem holds in each case. Ritt and
Kolchin¹²¹ developed a differential Galois theory in which differential equations
replace polynomial equations and in which a type of main theorem is proven. Other
Galois-type theories also exist; the idea of a Galois correspondence occurs frequently
and often gives important information about the objects that correspond.


Évariste Galois (1811-1832) Galois was born near Paris to well-educated parents and
after tutoring by his mother, entered school at the age of 12. His routine school work was
mediocre, but he discovered Legendre’s Éléments de Géométrie, which captivated him;
it is said he read it like a novel and mastered it in one reading. He then went on to works 
of Lagrange and Abel and at the age of 15, was reading professional-level material and 
beginning to make discoveries of his own. Unfortunately, his work was not systematic, 
with much of the calculation done mentally and only the results written down. He tried 
twice to enter the École Polytechnique but was rejected because of his lack of systematic
preparation. This rejection was a great loss for mathematics because the École, which
had produced many great mathematicians, may have been able to recognize his genius 
and provide the environment he needed. 


Nonetheless, Galois continued to make fundamental discoveries about polynomial equa- 
tions and, in 1829, submitted some of his results to the Académie des Sciences. The 
referee was Cauchy, who was certainly competent to understand it, but Cauchy lost 
the manuscript and it was never seen again! Undaunted, Galois submitted his work in 
the 1830 competition for the Académie's grand prize in mathematics. The article should 
have won this highest honor for its author, but the secretary, Fourier, took the manuscript 
home and, incredibly, died before reading it. The manuscript was lost. Finally, Galois 
sent a second memoir to the Académie. This time Poisson reviewed it and declared it 
to be “incomprehensible.” 


Whether because of bitterness over these events or because of his father’s republican 
sympathies, Galois reacted by blaming the Bourbon regime and joining the National 
Guard, a republican organization. It was a time of great political unrest in France, and, 
as a result, he was in and out of prison, regaining his freedom in 1832. At this time, 
he became involved with a girl. The details of this liaison are obscure but one thing is 
certain: He was challenged to a duel and felt honor bound to go through with it. He 
had a sense of foreboding about the duel and wrote, “I die the victim of an infamous 
coquette. It is in a miserable brawl that my life is extinguished. Oh! why die for so
trivial a thing....” The night before the duel, he wrote a letter to a friend outlining his 
discoveries. It is a tragic, poignant document with comments such as “I have no time” 
scribbled in the margins, and it ends by asking that Jacobi or Gauss give their opinion 
“not as to the truth, but as to the importance of these theorems.” Hermann Weyl, one
of the greatest mathematicians of the twentieth century, has written that “...this
letter, if judged by the novelty and profundity of the ideas it contains, is perhaps the 
most substantial piece of writing in the whole literature of mankind.” 


119Chase, S.U., Harrison, D.K., and Rosenberg, A., Galois Theory and Cohomology of
Commutative Rings, American Mathematical Society Memoir 52, Providence, RI: American
Mathematical Society, 1965.

120Jacobson, N., Structure of Rings, Colloquium Publications XXXVII, Providence, RI: American
Mathematical Society, 1964.

121See Kaplansky, I., An Introduction to Differential Algebra, Paris: Hermann, 1957. 




The duel was with pistols at 25 paces. Galois was hit in the stomach and lay 
where he fell until a passing peasant took him to a hospital. He died the next day, 
May 31, 1832, at the age of 20, and was buried in the common ditch at the cemetery of
Montparnasse. 


Exercises 10.2 


1. In each case, show that E ⊇ F is Galois, find the lattice of intermediate fields, and
find a primitive element for each intermediate field.

(a) E = ℚ(u), where u = e^(2πi/5), F = ℚ.

(b) E = ℚ(u), where u = e^(2πi/7), F = ℚ.

(c) E = ℚ(i, √3), F = ℚ.

(d) E = Zo(u), ua root of ¢+2+1, F=Zp. 

(ce) E= Q(¥2,1), F =Q. [Hint: Exercise 13 §10.1.] 


2. In each case, describe all possible intermediate field lattices for a finite Galois
extension E ⊇ F.
(a) |gal(E : F)| = p², where p is a prime. [Hint: Theorem 7 §8.2.]
(b) |gal(E : F)| = 2p, where p is a prime. [Hint: Theorem 3 §2.6.]

3. If E = GF(p^n), use the Dedekind-Artin theorem to show that E ⊇ ℤ_p is a Galois
extension and display the lattice of subfields of GF(p^12) in terms of the Frobenius
automorphism of E. [Hint: Example 6 §10.1 and Theorem 3.]

4. (a) If H = ⟨X⟩, X ⊆ gal(E : F), show that H° = {u ∈ E | σ(u) = u for all σ ∈ X}.
(b) If X is finite and K = {u ∈ E | σ(u) = u for all σ ∈ X}, show that K is an
intermediate field and that [E : K] ≥ |X|.


5. Let E = F(t) be the field of rational forms over a field F. In each case, compute
K = E_G and find the minimal polynomial m ∈ K[x] of t over K.

(a) G = ⟨σ⟩, where σ is that F-automorphism of E given by σ(t) = −t.

(b) G = ⟨σ⟩, where σ is that F-automorphism of E given by σ(t) = 1 − t.

6. Show that a finite Galois extension has a finite number of intermediate fields.

7. Let E ⊇ K ⊇ F be fields. If E ⊇ F is finite and Galois and if gal(E : F) is abelian,
show that K ⊇ F is Galois.

8. Let E ⊇ F be finite and Galois, where gal(E : F) is cyclic. If k divides [E : F],
show that there is exactly one intermediate field K such that [E : K] = k.

9. Let E ⊇ F be fields with G = gal(E : F) and consider the Galois connection.
(a) Show that H ↦ H° is onto if and only if every intermediate field is closed.
(b) Show that K ↦ K′ is onto if and only if every subgroup of G is closed.

10. Let E ⊇ F be fields with G = gal(E : F). If H ⊆ G is a subgroup and H°′ is finite,
show that H is closed.

11. If E ⊇ K ⊇ F are fields, show that E ⊇ K is Galois if and only if K is closed as an
intermediate field of E ⊇ F.

12. If E ⊇ F is finite and G = gal(E : F), show that E ⊇ F is Galois if and only if
|G| = [E : F].

13. If E ⊇ F is a finite Galois extension with gal(E : F) ≅ A₄, show that there is no
intermediate field K with [E : K] = 6. [Hint: Exercise 34 §2.6.]

14. Let E ⊇ F be a finite Galois extension, write G = gal(E : F), and consider the
intermediate field K = {u ∈ E | στ(u) = τσ(u) for all σ, τ ∈ G}. Show that K ⊇ F is a
Galois extension with abelian Galois group.




15. Let E ⊇ F be a finite Galois extension. If K and L are intermediate fields, let K ∨ L
denote the intersection of all intermediate fields containing K and L. The field K ∨ L
is called the compositum of K and L.

(a) Show that (K ∨ L)′ = K′ ∩ L′.
(b) Describe the group (K ∩ L)′ in terms of K′ and L′.

16. An extension E ⊇ F is called abelian (respectively, cyclic) if it is finite, Galois,
and the Galois group G = gal(E : F) is abelian (respectively, cyclic). If E ⊇ K ⊇ F,
where E ⊇ F is abelian (respectively, cyclic), show that both E ⊇ K and K ⊇ F are
abelian (respectively, cyclic).

17. Let K and K₁ be intermediate fields in a finite Galois extension E ⊇ F. Show that
K′ and K₁′ are conjugate subgroups of G = gal(E : F) if and only if K = σ(K₁) for
some σ ∈ G. (K and K₁ are called conjugate intermediate fields in this case.)

18. Let E ⊇ F be fields with G = gal(E : F). If K is an intermediate field and K ⊇ F is
a finite Galois extension, show that σ(K) = K for all σ ∈ G. [Hint: Theorem 3.]

19. Let f ∈ F[x], let E ⊇ F be a splitting field of f over F, and let G = gal(E : F).

(a) Show that G can be embedded in S_m, where f has m distinct roots in E. [Hint:
Theorem 3 §10.1.]
(b) If f is separable, show that [E : F] divides n! [Compare with Theorem 2 §6.3.]

20. Let E ⊇ F be a finite Galois extension with Galois group G = gal(E : F). If u ∈ E,
define the norm N(u) = N_{E/F}(u) and the trace T(u) = T_{E/F}(u) by

\[ N(u) = \prod_{\sigma \in G} \sigma(u) \quad \text{and} \quad T(u) = \sum_{\sigma \in G} \sigma(u). \]

(a) Show that N(u) and T(u) are in F. [Hint: G° = F.]
(b) Show that N(uv) = N(u)N(v) and T(u + v) = T(u) + T(v) for all u, v ∈ E.
(c) Let K = F(u) and let p = x^n + a_{n−1}x^{n−1} + ⋯ + a₁x + a₀ be the minimal
polynomial of u over F. If K ⊇ F is Galois, show that N_{K/F}(u) = (−1)^n a₀ and
T_{K/F}(u) = −a_{n−1}.

21. Let E ⊇ F be a finite Galois extension with Galois group G. If u ∈ E, define
f = ∏_{σ∈G} (x − σ(u)). Show that f ∈ F[x] and is a power of the minimal polynomial
of u over F.


10.3. INSOLVABILITY OF POLYNOMIALS 


Possibly the best known result in algebra is the formula for the roots u₁ and u₂ of
the equation x² + bx + c = 0. By completing the square, we write it in the form


(x + ½b)² = ¼(b² − 4c).

Hence, we obtain the roots from

u₁ = ½(−b + √(b² − 4c)) and u₂ = ½(−b − √(b² − 4c)).


This is called the quadratic formula and was known in antiquity. The expression
Δ = b² − 4c is called the discriminant of the quadratic x² + bx + c.

It was not until the sixteenth century that such a formula for the cubic was found.
Given y³ + ry² + sy + t, the substitution y = x − ⅓r gives y³ + ry² + sy + t = x³ + bx + c
for appropriate b and c. Hence, we need to find only formulas for the roots of cubic
equations of the form


x³ + bx + c = 0.




In this case, if the roots are u₁, u₂, and u₃, the cubic factors as

x³ + bx + c = (x − u₁)(x − u₂)(x − u₃),


so the roots are related to the coefficients as follows:

u₁ + u₂ + u₃ = 0,
u₁u₂ + u₁u₃ + u₂u₃ = b,
u₁u₂u₃ = −c.


Now let w = e^(2πi/3) be a cube root of unity, so that w³ = 1 and 1 + w + w² = 0. We
look for formulas for the roots of the form


u₁ = p + q, u₂ = wp + w²q, and u₃ = w²p + wq. (*)


Here, p and q are to be determined. Then the condition u₁ + u₂ + u₃ = 0 is
automatically satisfied because 1 + w + w² = 0, and the other two requirements
reduce to pq = −b/3 and p³ + q³ = −c, respectively. These equations imply that
p³ and q³ both satisfy the quadratic equation x² + cx − b³/27 = 0. The quadratic
formula then gives

\[ p = \left[\tfrac{1}{2}\left(-c + \sqrt{c^2 + \tfrac{4}{27}b^3}\right)\right]^{1/3} \quad \text{and} \quad q = \left[\tfrac{1}{2}\left(-c - \sqrt{c^2 + \tfrac{4}{27}b^3}\right)\right]^{1/3}. \]

If we choose the cube roots so that pq = −b/3, then (*) gives the three roots of
x³ + bx + c = 0. This expression is called the cubic formula and was first discussed
by Scipione del Ferro (ca. 1465-1526). Incidentally, the quantity Δ = −4b³ − 27c²
is called the discriminant of x³ + bx + c, so the quantities p and q are given by

\[ \left[\tfrac{1}{2}\left(-c \pm \sqrt{-\Delta/27}\right)\right]^{1/3}. \]
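
The cubic formula can be tried numerically. The following Python sketch is only an illustration using floating-point complex arithmetic: it picks one cube root p of p³ and then sets q = −b/(3p), which enforces the condition pq = −b/3. The function name `cubic_roots` and the tolerance `1e-14` are assumptions made for this example.

```python
import cmath


def cubic_roots(b, c):
    """Return the three roots of x^3 + b*x + c = 0 using the cubic formula.

    p^3 is a root of x^2 + c*x - b^3/27 = 0; q is chosen so that p*q = -b/3.
    """
    w = cmath.exp(2j * cmath.pi / 3)          # primitive cube root of unity
    disc = cmath.sqrt(c * c + 4 * b**3 / 27)  # square root from the quadratic formula
    p3 = (-c + disc) / 2
    if abs(p3) < 1e-14:
        # Use the other root of the quadratic so that p is nonzero when possible.
        p3 = (-c - disc) / 2
    if abs(p3) < 1e-14:
        return [0j, 0j, 0j]                   # only happens when b = c = 0
    p = p3 ** (1 / 3)                         # principal complex cube root
    q = -b / (3 * p)                          # enforces p*q = -b/3
    return [p + q, w * p + w * w * q, w * w * p + w * q]


# x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3)
roots = cubic_roots(-7, 6)
for r in roots:
    print(f"root ~ {r:.6f}, residual {abs(r**3 - 7 * r + 6):.2e}")
```

In this example the quantity under the square root is negative, so p and q are nonreal even though all three roots are real.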

Niccolò Tartaglia later rediscovered the cubic formula, and Girolamo Cardano
published it in 1545 in his book Ars Magna. The book also contained Lodovico 
Ferrari’s method for solving quartic equations, which led to many attempts in the 
seventeenth and eighteenth centuries to find a formula for the solution of quintic 
equations. Both Euler and Lagrange tried it and failed, although Lagrange suc- 
ceeded in unifying the lower degree methods. In 1824, Abel gave the first conclusive 
proof that no such formula exists; the proof we give is due to Galois. 

Clearly, the quadratic and cubic formulas are valid over any field F satisfying
char F ≠ 2, 3, and the roots can be found in an extension field of F obtained by
adjoining square and cube roots of elements of F. If E ⊇ F are fields, E is called a
radical extension of F if a chain


E = E₀ ⊇ E₁ ⊇ E₂ ⊇ ⋯ ⊇ E_r = F


of intermediate fields E_i exists such that

E_i = E_{i+1}(u_i), where u_i^{n_i} ∈ E_{i+1} for some n_i ≥ 1.


A polynomial in F[x] is called solvable over F if all its roots lie in some radical
extension of F, equivalently if some radical extension of F contains a splitting field.
Note that every radical extension is finite and every finite field is a radical extension
of any subfield.




Thus, a polynomial f in F[x] is solvable if and only if we can find the roots of
f (in some splitting field) by using only operations of the field F and adjoining nth
roots. Clearly, these operations yield the roots of quadratic and cubic polynomials
(by the preceding formulas), so all quadratics and cubics are solvable (provided
char F ≠ 2, 3). This statement also holds for quartics¹²² but fails for quintics:
There is a polynomial of degree 5 in ℚ[x] that is not solvable.

Let f be a polynomial in F[x], where F is a field. If E ⊇ F is any splitting field
of f over F, the group gal(E : F) is called the Galois group of the polynomial
f. Note that the definition of the Galois group of a polynomial does not depend on
which splitting field is used. In fact, if E₁ ⊇ F is another splitting field of f, there is
an F-isomorphism E → E₁, which implies that gal(E : F) ≅ gal(E₁ : F).

Galois’ idea was to characterize solvable polynomials by a property of their 
Galois groups. Recall that a group G is called solvable if there is a chain of 
subgroups 


G = G₀ ⊇ G₁ ⊇ ⋯ ⊇ G_n = {1}


such that G_{i+1} ◁ G_i and G_i/G_{i+1} is abelian for each i. Clearly, every abelian group
is solvable, and we discussed these groups at length in Section 9.2. The result we 
need is Theorem 4 §9.2, which we restate as Lemma 1 for reference. 


Lemma 1. If G is a group and K is a normal subgroup of G, then G is solvable if 
and only if both K and G/K are solvable. 


Note that the only solvable simple groups are abelian. Hence, the symmetric group
S_n is not solvable if n ≥ 5, because otherwise its normal subgroup A_n would be
solvable by Lemma 1, contrary to the fact (Theorem 8 §2.8) that it is simple and
nonabelian.
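
For small n this can be confirmed by brute force. The sketch below does not use the chain definition directly; it uses the derived series (repeatedly taking the subgroup generated by commutators), which is a standard equivalent test for finite groups and is an assumption of this example rather than the method of the text. All function names are hypothetical.

```python
from itertools import permutations


def compose(a, b):
    """Return the permutation a∘b (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(len(b)))


def inverse(a):
    """Return the inverse permutation of a."""
    inv = [0] * len(a)
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)


def generated(gens, n):
    """Return the subgroup of S_n generated by gens (closure under products)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def derived_subgroup(G, n):
    """Return the subgroup generated by all commutators a b a^{-1} b^{-1}."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in G for b in G
    }
    return generated(commutators, n)


def is_solvable(G, n):
    """Follow the derived series; G is solvable iff the series reaches {1}."""
    while len(G) > 1:
        D = derived_subgroup(G, n)
        if len(D) == len(G):   # the series has stalled at a nontrivial group
            return False
        G = D
    return True


def symmetric_group(n):
    """Return S_n as a set of tuples acting on {0, ..., n-1}."""
    return frozenset(permutations(range(n)))


for n in (3, 4, 5):
    print(f"S_{n} solvable: {is_solvable(symmetric_group(n), n)}")
```

For n = 5 the series stalls at a subgroup of order 60, which is A₅.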

Now we can give Galois’ approach to the insolvability of the general quintic. 
The key result is Galois’ criterion. 


Galois Criterion. Let F be a field of characteristic 0. Then a polynomial in
F[x] is solvable over F if and only if its Galois group is a solvable group.¹²³


Thus, Galois simply produced a polynomial of degree 5 in ℚ[x] whose Galois group
is S₅ (and hence not solvable). Because this polynomial is not solvable, some root
cannot be expressed using only rational operations and the extraction of nth roots.
Clearly, only half of Galois’ criterion is needed: Solvable polynomials have solvable
Galois groups; this is Theorem 2 (we do not prove the converse¹²⁴). Here is an
example of a polynomial that is not solvable.


Example 1. Let p = x⁵ − 6x + 2 in ℚ[x]. Show that the Galois group of p is S₅
and hence that p is not solvable.


122See, for example, Ehrlich, G., Fundamental Concepts of Abstract Algebra, Boston: PWS- 
KENT, 1991, p. 327. 


123This criterion is the source of the term solvable group.
124See Rotman, J., Galois Theory, Berlin: Springer, 1990, p. 55. 




Solution. We let E ⊇ ℚ be a splitting field of p so that G = gal(E : ℚ) is the Galois
group of p. We show that G ≅ S₅ by identifying G as a group of permutations of the
set X of roots of p in ℂ (see Theorem 3 §10.1). Now p′ = 5x⁴ − 6, which has real roots
±a, where a is the real fourth root of 6/5. We can easily verify that p(a) < 0 and
p(−a) > 0, so the graph of p is as shown in the figure. In particular, p has three
distinct real roots and two (conjugate) nonreal roots.¹²⁵ Hence, complex conjugation
induces the transposition in G that exchanges the two nonreal roots (by Example 4
§10.2). However, p is irreducible by the Eisenstein criterion and so is the minimal
polynomial of any root u in E. Hence, [ℚ(u) : ℚ] = deg p = 5. But |G| = [E : ℚ]
because E ⊇ ℚ is a Galois extension, so 5 divides |G|. Thus, G contains an element of
order 5 by Cauchy’s theorem (Theorem 4 §8.2). The only elements of order 5 in S₅ are
the 5-cycles, which shows that G contains a 5-cycle and a 2-cycle (conjugation).
Finally, this in turn implies that G = S_X by Lemma 2 below, so G ≅ S₅, as required. □
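
Two of the facts used in this solution are easy to check by machine. The sketch below, an illustration only, tests the Eisenstein condition at the prime 2 and locates the three real roots by sign changes of p at the integers −2, …, 2. The names `satisfies_eisenstein` and `sign_changes` are hypothetical.

```python
def p(x):
    """The polynomial p = x^5 - 6x + 2 of Example 1."""
    return x**5 - 6 * x + 2


def satisfies_eisenstein(coeffs, q):
    """Check the Eisenstein condition at the prime q.

    coeffs are integers listed from the leading coefficient down to the
    constant term: q must not divide the leading coefficient, must divide
    all the others, and q^2 must not divide the constant term.
    """
    lead, *lower = coeffs
    return (
        lead % q != 0
        and all(a % q == 0 for a in lower)
        and lower[-1] % (q * q) != 0
    )


coeffs = [1, 0, 0, 0, -6, 2]
a = (6 / 5) ** 0.25                      # positive real root of p' = 5x^4 - 6

# Each sign change between consecutive integers locates one real root.
values = [p(x) for x in range(-2, 3)]
sign_changes = sum(1 for s, t in zip(values, values[1:]) if s * t < 0)

print("Eisenstein at 2:", satisfies_eisenstein(coeffs, 2))
print("p(a) < 0 < p(-a):", p(a) < 0 < p(-a))
print("sign changes on [-2, 2]:", sign_changes)
```

Since p′ has only two real roots, p has at most three real roots, so the three sign changes account for all of them.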


The proof of the next theorem requires two lemmas that are of independent
interest. The first involves the symmetric group S_p, where p is a prime.


Lemma 2. If p is a prime, S_p is generated by any p-cycle and any 2-cycle.


Proof. Choose the notation so that σ = (1 2 ⋯ p) and τ = (1 k) are the given
cycles. Now σ^{k−1}(1) = k and σ^{k−1} is a p-cycle (as p is prime), so we may assume
that σ = (1 2 ⋯ p) and τ = (1 2). Hence (k+1 k+2) = σ^k τ σ^{−k} for each k (by
Lemma 3 §2.8). Because (1 2), (1 3), …, (1 p) generate S_p and because we have
(1 a+1) = (1 a)(a a+1)(1 a), the proof is complete. □
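
Lemma 2 can be checked for p = 5 by generating the subgroup directly, and the same computation shows why primality matters. The sketch below is an illustration only; points are labelled 0, …, n−1 instead of 1, …, n, and the helper names `cycle` and `generated` are hypothetical.

```python
def compose(a, b):
    """Return the permutation a∘b (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(len(b)))


def generated(gens, n):
    """Return the subgroup of S_n generated by gens (closure under products)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


def cycle(points, n):
    """Return the permutation of {0, ..., n-1} given by the cycle (points[0] points[1] ...)."""
    perm = list(range(n))
    for i, x in enumerate(points):
        perm[x] = points[(i + 1) % len(points)]
    return tuple(perm)


# A 5-cycle together with an arbitrary transposition generates all of S_5.
order_p5 = len(generated([cycle([0, 1, 2, 3, 4], 5), cycle([1, 3], 5)], 5))

# For n = 4 (not prime) a 4-cycle and the transposition (0 2) generate less.
order_n4 = len(generated([cycle([0, 1, 2, 3], 4), cycle([0, 2], 4)], 4))

print(order_p5, order_n4)
```

In the second case the two permutations generate only the symmetries of a square on the vertices 0, 1, 2, 3, a group of order 8.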


Lemma 3. If G is a cyclic group of order n, then aut G ≅ ℤ_n^*.


Proof. Since G ≅ ℤ_n, we show that aut ℤ_n ≅ ℤ_n^*. If m ∈ ℤ_n^*, define σ_m : ℤ_n → ℤ_n
by σ_m(k) = mk. This is an automorphism of ℤ_n because m is a unit in the ring ℤ_n,
so we have a mapping


θ : ℤ_n^* → aut ℤ_n given by θ(m) = σ_m.


This is a homomorphism because σ_{mm′} = σ_m σ_{m′}, and it is one-to-one because
σ_m = σ_{m′} implies m = σ_m(1) = σ_{m′}(1) = m′. Finally, given σ ∈ aut ℤ_n, write
m = σ(1). Then m is a generator of ℤ_n and so gcd(m, n) = 1. Thus m ∈ ℤ_n^* and
σ = σ_m, because σ(k) = σ(1)k = mk = σ_m(k) for all k ∈ ℤ_n. Hence, σ = θ(m), so
θ is an isomorphism. □
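
The correspondence in Lemma 3 can be verified for a particular n. The sketch below takes n = 12 (an arbitrary choice) and represents each map σ_m as the tuple of its values on 0, …, n−1; the function names are hypothetical.

```python
from math import gcd


def units(n):
    """Return the units of the ring Z_n as a sorted list."""
    return [m for m in range(n) if gcd(m, n) == 1]


def sigma(m, n):
    """Return the map k -> m*k on Z_n as a tuple of images."""
    return tuple((m * k) % n for k in range(n))


def is_automorphism(f, n):
    """Check that f is a bijective additive map on Z_n."""
    bijective = sorted(f) == list(range(n))
    additive = all(
        f[(j + k) % n] == (f[j] + f[k]) % n for j in range(n) for k in range(n)
    )
    return bijective and additive


n = 12
# An additive map on Z_n is determined by the image m of 1, so it suffices
# to test the maps sigma(m, n) for m = 0, ..., n-1.
automorphisms = {sigma(m, n) for m in range(n) if is_automorphism(sigma(m, n), n)}
from_units = {sigma(m, n) for m in units(n)}

# theta(a*b) = theta(a) theta(b): composing sigma_a with sigma_b gives sigma_{ab}.
multiplicative = all(
    tuple(sigma(a, n)[k] for k in sigma(b, n)) == sigma(a * b % n, n)
    for a in units(n) for b in units(n)
)

print(units(n), automorphisms == from_units, multiplicative)
```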


As in the derivation of the cubic formula, the nth roots of unity (that is, the
roots of x^n − 1) play an important role in the proof of the Galois criterion. If F is
any field, a root of unity w in some extension field E is called a primitive nth root
of unity over F if |w| = n in E*. Clearly, w = e^(2πi/n) is an example in ℂ. In general,
such an element w exists in an extension field of F if and only if p = char F does not


125The tacit assumption is that p splits in C, which the fundamental theorem of algebra guarantees. 




divide n (this is intended to include the case char F = 0). Indeed, if n = pd, then
x^n − 1 = (x^d − 1)^p, so no primitive nth root of unity can exist over F. Conversely,
if p does not divide n, then x^n − 1 and its derivative nx^{n−1} are relatively prime
in F[x], so x^n − 1 has n distinct roots in any splitting field E ⊇ F (by Theorem 3
§6.4). These roots form a subgroup of E* of order n that is cyclic by Theorem 7
§6.4. A generator of this group clearly is a primitive nth root of unity, which proves
the first statement in Theorem 1.


Theorem 1. If F is a field and n ≥ 1 is an integer, a primitive nth root of unity
w over F exists if and only if char F does not divide n. In this case,
(1) F(w) is the splitting field of x^n − 1 over F.
(2) F(w) ⊇ F is a finite Galois extension and gal(F(w) : F) is isomorphic to a
subgroup of ℤ_n^*.


Proof. Here, F(w) is the splitting field of x^n − 1 over F because (as w is primitive)
the roots are 1, w, …, w^{n−1}, which all lie in F(w). Moreover, these roots are
distinct, so x^n − 1 is separable over F. Hence, F(w) ⊇ F is Galois by Theorem 3
§10.2. Finally, if σ ∈ gal(F(w) : F), then σ induces an automorphism σ̂ of
⟨w⟩ = {1, w, …, w^{n−1}} by restriction, and the map σ ↦ σ̂ is a one-to-one group
homomorphism. But aut⟨w⟩ ≅ ℤ_n^* by Lemma 3, which completes the proof. □


From the definition of a radical extension, any discussion of Galois’ criterion
clearly involves extensions F(u) ⊇ F, where u^n ∈ F. Lemma 4 will be needed.


Lemma 4. Let F be a field containing a primitive nth root of unity and consider
an extension F(u), where u^n ∈ F. Then F(u) ⊇ F is a Galois extension and
gal(F(u) : F) is abelian.


Proof. Let w ∈ F be a primitive nth root of unity and write u^n = a ∈ F. Then
x^n − a has roots u, uw, …, uw^{n−1} in F(u), distinct because w is primitive. Hence,
x^n − a is separable over F, and F(u) is the splitting field. Then F(u) ⊇ F is Galois
by Theorem 3 §10.2. Finally, if σ and τ are in gal(F(u) : F), then σ(u) and τ(u) are
roots of x^n − a, say, σ(u) = uw^i and τ(u) = uw^j. Because σ(w) = w = τ(w), this
gives τσ(u) = uw^{i+j} = στ(u). Thus, στ = τσ, so gal(F(u) : F) is abelian. □


With this result, we can prove the half of Galois’ criterion needed in Example 1. 


Theorem 2. (Galois) Let E ⊇ F be a radical Galois extension, where char F = 0.
Then gal(E : F) is a solvable group.


Proof. It suffices to find a field K ⊇ E ⊇ F, where K ⊇ F is Galois and gal(K : F)
is solvable, because then gal(E : F) is an image of gal(K : F) by the main theorem.
This uses the hypothesis that E ⊇ F is Galois; we use the assumption that E ⊇ F
is radical to construct K. Let


E = E₀ ⊇ E₁ ⊇ ⋯ ⊇ E_r = F,


where E_i = E_{i+1}(u_i) and u_i^{n_i} ∈ E_{i+1} for i = 0, 1, …, r − 1. Write n = n₀n₁ ⋯ n_{r−1}
and (as char F = 0) let w be a primitive nth root of unity over F. Define K_i = E_i(w)
for 0 ≤ i ≤ r and write K = K₀ = E(w). Then K is the splitting field of x^n − 1 over
E, and E is the splitting field over F of some f in F[x] by Theorem 3 §10.2. Hence,
K is the splitting field of f(x)(x^n − 1) over F. Thus, K ⊇ F is Galois (because




char F = 0), and it remains to show that gal(K : F) is a solvable group. Write
K_{r+1} = F and consider the chain of fields


K = K₀ ⊇ K₁ ⊇ ⋯ ⊇ K_r ⊇ K_{r+1} = F. (**)

Claim. K_i ⊇ K_{i+1} is Galois and gal(K_i : K_{i+1}) is abelian for each i = 0, 1, 2, …, r.


Proof. If i = r, the claim follows from Theorem 1 because K_r = F(w) = K_{r+1}(w).
If i < r, the claim follows from Lemma 4 because K_i = E_{i+1}(u_i, w) = K_{i+1}(u_i),
u_i^{n_i} ∈ E_{i+1} ⊆ K_{i+1}, and K_{i+1} contains a primitive n_i-th root of unity
(namely, w^{n/n_i}). This proves the claim.


Now (**) gives rise to the chain of Galois groups:

{ε} = gal(K : K₀) ⊆ gal(K : K₁) ⊆ ⋯ ⊆ gal(K : K_r) ⊆ gal(K : K_{r+1}).


Because K_{r+1} = F, the proof is complete if we can show that gal(K : K_i) is a
solvable group for each i = 0, 1, 2, …, r + 1. This is clear if i = 0, so assume
inductively that gal(K : K_i) is solvable and consider K ⊇ K_i ⊇ K_{i+1}. Then the
extension K ⊇ K_{i+1} is Galois by the main theorem (applied to K ⊇ K_{i+1} ⊇ F)
and K_i ⊇ K_{i+1} is Galois by the claim. Hence, the main theorem shows that
gal(K : K_i) is a normal subgroup of gal(K : K_{i+1}) and that the factor is isomorphic
to gal(K_i : K_{i+1}) and so is abelian. Then gal(K : K_{i+1}) is solvable by Lemma 1,
and the proof is complete. □


Theorem 2 (together with Example 1) settles the question of solvability of
polynomials of degree 5 in the negative, but it leaves the higher degree cases open.
However, we can use the main theorem to exhibit (for every n ≥ 2) a polynomial
of degree n whose Galois group is S_n and so is not solvable if n ≥ 5. We devote the
rest of this section to this piece of classical algebra.

A key aspect of Galois theory is that if E ⊇ F is a splitting field of the
polynomial f in F[x], then each automorphism in gal(E : F) permutes the roots of f.
Moreover, the coefficients of f are functions of the roots that remain unchanged
when the roots are permuted. For example, if the roots are u₁, u₂, and u₃,

f = (x − u₁)(x − u₂)(x − u₃)
  = x³ − (u₁ + u₂ + u₃)x² + (u₁u₂ + u₁u₃ + u₂u₃)x − u₁u₂u₃. (1)


We formalize this idea as follows.
If F is a field and F[x_i] = F[x₁, x₂, …, x_n] is the polynomial ring in n
indeterminates x₁, x₂, …, x_n, a polynomial s(x_i) = s(x₁, x₂, …, x_n) in F[x_i] is called
symmetric if

s(x_{σ1}, x_{σ2}, …, x_{σn}) = s(x₁, x₂, …, x_n) for all σ ∈ S_n.


Thus, x₁²x₂x₃ + x₁x₂²x₃ + x₁x₂x₃² and x₁² + x₂² + x₃² are symmetric polynomials in
F[x₁, x₂, x₃]. The coefficients of the polynomial (1) are symmetric polynomials in
the u_i, and these polynomials play an important role in what we do next.

Given F[x_i] = F[x₁, x₂, …, x_n], the elementary symmetric polynomials
s₀, s₁, s₂, …, s_n in F[x_i] are defined as follows: s₀ = 1 and

\[ s_k = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} x_{i_1} x_{i_2} \cdots x_{i_k} \quad \text{for } k = 1, 2, \ldots, n. \]




Thus, s₁ = x₁ + x₂ + ⋯ + x_n and s_n = x₁x₂⋯x_n for any n. If n = 3, then
s₀ = s₀(x₁, x₂, x₃) = 1,
s₁ = s₁(x₁, x₂, x₃) = x₁ + x₂ + x₃,
s₂ = s₂(x₁, x₂, x₃) = x₁x₂ + x₁x₃ + x₂x₃,
s₃ = s₃(x₁, x₂, x₃) = x₁x₂x₃.

Hence, (1) is the case for n = 3 of the easily verified formula

(x − u₁)(x − u₂)⋯(x − u_n)
  = x^n − s₁(u_i)x^{n−1} + s₂(u_i)x^{n−2} − ⋯ + (−1)^n s_n(u_i). (2)


We discussed this material at length in Section 4.5, and our interest here is in the
use of (2) to calculate a certain Galois group.
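
Formula (2) is easy to test on concrete numbers. The sketch below, an illustration with arbitrarily chosen integer roots, expands the product (x − u₁)⋯(x − u_n) and compares each coefficient with (−1)^k s_k(u_i). The function names are hypothetical.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(values, k):
    """Return s_k(values): the sum of all products of k distinct entries."""
    return sum(prod(c) for c in combinations(values, k))


def expand_product(roots):
    """Return the coefficients of (x - u_1)...(x - u_n), from x^n down to the constant."""
    coeffs = [1]
    for u in roots:
        shifted = coeffs + [0]                  # current polynomial times x
        scaled = [0] + [u * c for c in coeffs]  # current polynomial times u
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs


roots = [1, 2, 3, 4]
expanded = expand_product(roots)
from_formula = [(-1) ** k * elementary_symmetric(roots, k) for k in range(len(roots) + 1)]

print(expanded)
print(from_formula)
```

Both lists are the coefficients of x⁴ − 10x³ + 35x² − 50x + 24, so s₁ = 10, s₂ = 35, s₃ = 50, and s₄ = 24 for these roots.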


If x₁, …, x_n are indeterminates over a field F, let E = F(x₁, …, x_n) denote the
field of rational forms over F; that is, E is the field of quotients of the integral
domain F[x_i] = F[x₁, …, x_n]. Hence, the elements of E are quotients

r(x_i) = f(x_i) / g(x_i),

where f and g are polynomials in the variables x₁, x₂, …, x_n, and g ≠ 0. If σ ∈ S_n,
we define


σ̂ : E → E by σ̂[r(x₁, …, x_n)] = r(x_{σ1}, …, x_{σn}).


One verifies that σ̂ is an automorphism of E that fixes F; that is, σ̂ ∈ gal(E : F).
Moreover, σ ↦ σ̂ is a one-to-one group homomorphism S_n → gal(E : F). Write its
image as Ŝ_n = {σ̂ | σ ∈ S_n} so that Ŝ_n is a subgroup of gal(E : F) isomorphic to
S_n. Our interest is in the fixed field S of Ŝ_n in E:


S = E_{Ŝ_n}
  = {r(x_i) | σ̂(r) = r for all σ ∈ S_n}
  = {r(x₁, …, x_n) | r(x_{σ1}, …, x_{σn}) = r(x₁, …, x_n) for all σ ∈ S_n}.


This is called the field of symmetric rational forms over F. Note that the
Dedekind-Artin theorem gives


[E : S] = |Ŝ_n| = n! (3)

In fact, we claim that E ⊇ S is Galois and S = F(s₀, s₁, …, s_n) = F(s_i), where
s₀, s₁, …, s_n are the elementary symmetric polynomials in the variables x_i. Clearly,

E ⊇ S ⊇ F(s_i).

If t is an indeterminate over E, consider the polynomial f(t) in F(s_i)[t] given by

f(t) = (t − x₁)(t − x₂)⋯(t − x_n) = t^n − s₁t^{n−1} + ⋯ + (−1)^n s_n.


Then E is the splitting field of f(t) over F(s_i) and so, as deg f = n, we have
[E : F(s_i)] ≤ n! by Theorem 2 §6.3. This result, along with (3), shows that
S = F(s_i). Moreover, f(t) is clearly separable over S = F(s_i), so E ⊇ S is a
Galois extension (by Theorem 3 §10.2) and |gal(E : S)| = [E : S] = n!. Because
Ŝ_n ⊆ gal(E : S) also has order n!, this gives gal(E : S) = Ŝ_n. The next theorem
collects these results.


Theorem 3. Let $F$ be a field, let $E = F(x_1, \ldots, x_n)$ be the field of rational forms
over $F$, and let $S \subseteq E$ be the subfield of all symmetric rational forms.

(1) $E \supseteq S$ is a Galois extension, $[E : S] = n!$, and $\operatorname{gal}(E : S) \cong S_n$.

(2) $S = F(s_0, s_1, s_2, \ldots, s_n)$, where $s_i$ are the elementary symmetric
polynomials in the $x_i$, and $E$ is the splitting field over $S$ of the
polynomial $f(t) = t^n - s_1t^{n-1} + s_2t^{n-2} + \cdots + (-1)^n s_n$.


Corollary 1. If $n \geq 5$, a nonsolvable polynomial of degree $n$ exists.


Proof. The polynomial $f(t) \in S[t]$ in Theorem 3 is not solvable over $S$ because its
Galois group $S_n$ is not solvable if $n \geq 5$. □


The fact that $S = F(s_0, s_1, \ldots, s_n)$ in Theorem 3 means that every symmetric
rational form in the variables $x_1, x_2, \ldots, x_n$ is the quotient of two polynomials in the
elementary symmetric polynomials $s_0, s_1, \ldots, s_n$ in these variables. In fact, every
symmetric polynomial in the $x_i$ is actually a polynomial in the $s_i$, a fact proved
(without any field theory) in Theorem 4 §4.5.

Because every group of order $n$ is isomorphic to a subgroup of $S_n$, part (4) of
the main theorem provides a bonus.


Corollary 2. Every finite group is isomorphic to the Galois group of a finite Galois 
extension. 


Surprisingly, no one knows whether every finite group is isomorphic to the Galois
group of a finite Galois extension of $\mathbb{Q}$. Even small order groups can be complicated.
For example,$^{126}$ the quaternion group $Q$ is the Galois group of $E \supseteq \mathbb{Q}$, where $E$ is the
splitting field of $x^8 - 72x^6 + 180x^4 - 144x^2 + 36$. On the other hand, in 1956 the
Russian mathematician I. R. Shafarevich proved that if $F \supseteq \mathbb{Q}$ is a finite algebraic
extension, and $G$ is any finite solvable group, then $G \cong \operatorname{gal}(E : F)$ for some Galois
extension $E \supseteq F$.


Exercises 10.3 


1. Find a radical extension of Q containing 


(a) V3(¥5 — V7) (b) (v5 — 3)(4— 376) 
2. In each case, show that $f$ is not solvable by radicals.
(a) $f = x^5 - 4x - 2$   (b) $f = x^5 - 6x^2 + 2$


3. Show that x7 — 142 + 2 in Q[z] has Galois group Sy. 

4. If $p$ is a prime and $f \in \mathbb{Q}[x]$ is irreducible of degree $p$, and if $f$ has exactly two
nonreal roots, show that $f$ has Galois group $S_p$.

5. Show that every polynomial of degree at most 4 is solvable. [Hint: Theorem 3 §9.2.]

6. If $f$ is a separable, irreducible cubic in $F[x]$, $F$ a field, show that its Galois group is
$S_3$ or $C_3$. [Hint: Theorem 3 §10.1.]

7. Consider $f = x^3 - 3x + 1$ in $\mathbb{Q}[x]$. Find the roots of $f$ and determine the Galois
group.

8. Let $f \in F[x]$, where $F$ is a field and char $F \neq 2$. Assume that $\deg f = n$ and that the
roots $u_1, u_2, \ldots, u_n$ in a splitting field $E \supseteq F$ are distinct. If $G = \operatorname{gal}(E : F)$, view $G$
as a group of permutations in $S_X$, where $X = \{u_1, \ldots, u_n\}$. [See Theorem 3 §10.1.]


126 Dean, R. A., American Mathematical Monthly, 88 (1981), 42-45.


Define $\Delta \in E$ by $\Delta = \prod_{i<j}(u_i - u_j)$, so if $n = 3$, $\Delta = (u_1 - u_2)(u_1 - u_3)(u_2 - u_3)$.
The element $\Delta^2$ is called the discriminant of $f$.

(a) Show that $\Delta^2 \in F$.

(b) Show that the permutation $\sigma \in S_X$ is even if and only if $\sigma(\Delta) = \Delta$ and that $\sigma$ is
odd if and only if $\sigma(\Delta) = -\Delta$.

(c) Show that $F(\Delta)$ corresponds to the even permutations in $S_X$ in the Galois
correspondence.

(d) Show that $G$ consists of even permutations if and only if $\Delta \in F$.

(e) If $f = x^2 + bx + c$, show that $\Delta^2 = b^2 - 4c$, the usual discriminant.

(f) If $f = x^3 + bx + c$, show that $\Delta^2 = -4b^3 - 27c^2$ is the usual discriminant. [Hint:
$u_1 + u_2 + u_3 = 0$ and $u_1u_2 + u_1u_3 + u_2u_3 = b$ imply that $(u_i - u_j)^2 = -b - 3u_iu_j$.]


10.4 CYCLOTOMIC POLYNOMIALS AND WEDDERBURN’S THEOREM 


If $n$ is a positive integer, the irreducible factors of $x^n - 1$ in $\mathbb{Q}[x]$ are called
cyclotomic polynomials and are important in number theory. In this section, we derive
several properties of these polynomials using Galois theory, and use them to prove
a famous theorem of Wedderburn: Every finite division ring is a field.

If $F$ is a field and $n \geq 1$ is an integer, we let $E \supseteq F$ be a splitting field of $x^n - 1$.
The roots form a subgroup of $E^*$, which is cyclic by Theorem 7 §6.4, and this group
has order $n$ if and only if char $F$ does not divide $n$ (Theorem 1 §10.3).$^{127}$ In this
case, a generator of this group is called a primitive $n$th root of unity over $F$. This
group has exactly $\varphi(n)$ generators, where $\varphi$ is the Euler function (Section 2.6).

Let $w_1, w_2, \ldots, w_{\varphi(n)}$ be the primitive $n$th roots of unity over a field $F$ (where
char $F$ does not divide $n$) and define


$$\Phi_n = (x - w_1)(x - w_2) \cdots (x - w_{\varphi(n)}).$$


This is called the $n$th cyclotomic polynomial over $F$. Clearly, $\Phi_n \in E[x]$; in fact,
$\Phi_n \in F[x]$ (see Theorem 1).


Example 1. $\Phi_1 = x - 1$ over any field.

Example 2. If char $F \neq 2$, show that $\Phi_4 = x^2 + 1$.


Solution. Because $x^4 - 1 = (x - 1)(x + 1)(x^2 + 1)$, the splitting field of $x^4 - 1$ is
$F(w)$, where $w^2 = -1$. Hence, the primitive fourth roots of unity are $w$ and $-w$, so
$\Phi_4 = (x - w)(x + w) = x^2 + 1$. □

Example 3. Show that $\Phi_p = x^{p-1} + x^{p-2} + \cdots + x + 1$, where $p \neq$ char $F$ is
a prime.


Solution. The $p$th roots of unity are the (distinct) roots of $x^p - 1$ and every one
(except 1) is primitive because $p$ is a prime. As $x^p - 1 = (x - 1)\Phi_p$, the result
follows. □


Note: For the rest of this section, we adopt the convention that if $n \geq 1$ is an
integer, $d \mid n$ means that $d$ is a positive divisor of $n$.


127 This is intended to include the case char $F = 0$.


Theorem 1. Let $F$ be a field, where char $F$ does not divide $n$.

(1) $\Phi_n \in F[x]$.

(2) $x^n - 1 = \prod_{d \mid n} \Phi_d$.

Proof. Let $w$ be any primitive $n$th root of unity over $F$.

(1) If $E = F(w)$, then $\Phi_n \in E[x]$ is clear. If $\sigma \in \operatorname{gal}(E : F)$, then $\sigma$ permutes
the primitive $n$th roots of unity and so fixes every coefficient of $\Phi_n$. But $E \supseteq F$ is
Galois (by Theorem 1 §10.3), so these coefficients are in $F$.

(2) If $d \mid n$, the primitive $d$th roots of unity are precisely the elements of order $d$
in $U = \langle w \rangle$. On the other hand, every element of $U$ is a primitive $d$th root of unity
for a unique positive divisor $d$ of $n$. Thus,


$$x^n - 1 = \prod_{u \in U}(x - u) = \prod_{d \mid n}\Big[\prod_{u \in U,\ o(u) = d}(x - u)\Big] = \prod_{d \mid n} \Phi_d.$$ □

If $p$ is a prime, (2) gives $x^p - 1 = \Phi_1\Phi_p = (x - 1)\Phi_p$. This in turn gives

$$\Phi_p = x^{p-1} + x^{p-2} + \cdots + x + 1, \quad p \text{ any prime},$$

as in Example 3. In general, (2) in Theorem 1 gives a recursive method for
determining the polynomials $\Phi_n$:


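\begin{align*}
\Phi_4 &= \frac{x^4 - 1}{\Phi_1\Phi_2} = \frac{x^4 - 1}{(x - 1)(x + 1)} = x^2 + 1, \\
\Phi_6 &= \frac{x^6 - 1}{\Phi_1\Phi_2\Phi_3} = \frac{x^6 - 1}{(x - 1)(x + 1)(x^2 + x + 1)} = x^2 - x + 1.
\end{align*}

Note that all the coefficients of the $\Phi_n$ are integers. Over $\mathbb{Q}$ this holds in general.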


Theorem 2. The cyclotomic polynomials $\Phi_n$ over $\mathbb{Q}$ have integral coefficients.


Proof. Use induction on $n$, beginning with $\Phi_1 = x - 1$. In general, (2) of Theorem
1 gives $x^n - 1 = \Phi_n f$, where

$$f = \prod_{d \mid n,\ d < n} \Phi_d$$

has integer coefficients by induction. Also, $f$ is monic (because each $\Phi_d$ is monic), so
$x^n - 1 = fq + r$ in $\mathbb{Z}[x]$, where either $r = 0$ or $\deg r < \deg f$. But then $r = (\Phi_n - q)f$
forces $r = 0$ and so $\Phi_n = q \in \mathbb{Z}[x]$. □


It is interesting to note that $n = 105$ is the smallest value of $n$ for which $\Phi_n$ has an
integer coefficient other than $0$, $1$, or $-1$.
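
The recursion coming from part (2) of Theorem 1 is easy to carry out by machine. The sketch below (the function names are our own, and polynomials are stored as integer coefficient lists from highest degree down) computes $\Phi_n$ by dividing $x^n - 1$ by each $\Phi_d$ with $d \mid n$, $d < n$; exact division by a monic integer polynomial stays inside $\mathbb{Z}[x]$, as in the proof of Theorem 2.

```python
from functools import lru_cache

def poly_divide_exact(num, den):
    """Divide integer polynomials (coefficients from highest degree down),
    assuming den is monic and divides num exactly."""
    num = list(num)
    quotient = []
    for i in range(len(num) - len(den) + 1):
        c = num[i]                      # leading coefficient still to be cancelled
        quotient.append(c)
        for j, d in enumerate(den):
            num[i + j] -= c * d
    assert not any(num), "division was not exact"
    return quotient

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of Phi_n over Q (highest degree first), using
    x^n - 1 = product of Phi_d over the positive divisors d of n."""
    poly = [1] + [0] * (n - 1) + [-1]   # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divide_exact(poly, cyclotomic(d))
    return tuple(poly)

phi_4 = cyclotomic(4)   # coefficients of x^2 + 1
phi_6 = cyclotomic(6)   # coefficients of x^2 - x + 1
```

Running `cyclotomic(n)` for $n < 105$ produces only the coefficients $0$, $1$, $-1$, while `cyclotomic(105)` contains the coefficient $-2$, in line with the remark above.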

We can now prove a famous theorem of J. H. M. Wedderburn. He proved it in
1905,$^{128}$ but the proof we give is due to Ernst Witt in 1931. It utilizes the class
equation for a finite group given in Section 8.2 and also requires two preliminary
results, the first of which is an easy consequence of the definition of the cyclotomic
polynomials.


Lemma 1. If $d \mid n$ and $d \neq n$, then $\Phi_n$ divides $(x^n - 1)/(x^d - 1)$ in $\mathbb{Z}[x]$.


Proof. Observe that $\Phi_n$ divides
$$x^n - 1 = (x^d - 1) \times \left(\frac{x^n - 1}{x^d - 1}\right),$$


128 Wedderburn, J.H.M., A Theorem on finite algebras, Transactions of the American
Mathematical Society, 6 (1905), 349-352.


so it suffices to show that $\Phi_n$ and $x^d - 1$ are relatively prime. But this follows
because the roots in $\mathbb{C}$ of $\Phi_n$ are primitive $n$th roots of unity and so cannot be
roots of $x^d - 1$. □


Lemma 2. Suppose $q^d - 1$ divides $q^n - 1$, where $q > 1$, $d \geq 1$, and $n \geq 1$. Then $d \mid n$.


Proof. Write $n = ad + r$ in $\mathbb{Z}$, where $0 \leq r < d$. Then, working modulo $q^d - 1$,
$1 \equiv q^n = q^{ad} \cdot q^r \equiv q^r$, which implies that $r = 0$. □
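
The argument in Lemma 2 can be checked numerically. The sketch below (the function names and search bounds are arbitrary choices for illustration) confirms that the remainder of $q^n - 1$ on division by $q^d - 1$ is $q^r - 1$ with $r = n \bmod d$, so divisibility happens exactly when $d \mid n$. Note that it tests both directions, whereas the lemma itself only states one.

```python
def remainder(q, n, d):
    """Remainder of q^n - 1 on division by q^d - 1."""
    return (q**n - 1) % (q**d - 1)

def lemma2_agrees(q_max, n_max):
    """True if, for all 2 <= q <= q_max and 1 <= d <= n <= n_max,
    q^d - 1 divides q^n - 1 exactly when d divides n."""
    for q in range(2, q_max + 1):
        for n in range(1, n_max + 1):
            for d in range(1, n + 1):
                divides = remainder(q, n, d) == 0
                if divides != (n % d == 0):
                    return False
    return True

check = lemma2_agrees(7, 12)
```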


Theorem 3. Wedderburn's Theorem. Every finite division ring is a field.


Proof. If $R$ is a finite division ring, let $Z = \{z \in R \mid zr = rz \text{ for all } r \in R\}$ denote
its center. Then $Z$ is a finite field, say $|Z| = q$. If $\{r_1, \ldots, r_n\}$ is a basis of $R$ as
a vector space over $Z$, then $|R| = q^n$. We consider the group $R^*$ and its center
$Z(R^*) = Z^*$. Clearly, $|R^*| = q^n - 1$ and $|Z^*| = q - 1$. If $R$ is not commutative,
then $n > 1$, so the class equation for $R^*$ (Theorem 3 §8.2) reads as follows: If
class $u_1$, class $u_2$, $\ldots$, class $u_m$ are the nonsingleton conjugacy classes in $R^*$, then


$$|R^*| = |Z^*| + \sum_i |R^* : N(u_i)|. \tag{$*$}$$


Now $R_i = \{r \in R \mid ru_i = u_ir\}$ is a division subring of $R$ that contains $Z$ and so has
order $|R_i| = q^{d_i}$ for some $d_i$. Moreover, $N(u_i) = R_i^*$, so $|N(u_i)| = q^{d_i} - 1$, which
divides $|R^*| = q^n - 1$ by Lagrange's theorem. Hence, $d_i \mid n$ by Lemma 2, so


$$|R^* : N(u_i)| = \frac{q^n - 1}{q^{d_i} - 1}$$


is a multiple of $\Phi_n(q)$ by Lemma 1 because $|R^* : N(u_i)| = |\text{class } u_i| > 1$. Because
$\Phi_n(q)$ also divides $|R^*| = q^n - 1$, $(*)$ implies that $\Phi_n(q)$ divides $|Z^*| = q - 1$. But
if $w_1, w_2, \ldots$ are the primitive $n$th roots of unity in $\mathbb{C}$, then $w_i \neq 1$ for each $i$ because
$n > 1$. Hence, $|q - w_i| > (q - 1)$ for each $i$ (see the diagram), so

$$|\Phi_n(q)| = \prod_i |q - w_i| > (q - 1)^{\varphi(n)} \geq q - 1.$$

This contradiction establishes Wedderburn's theorem. □
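
The final inequality in the proof can be illustrated numerically. The following sketch (our own helper, using floating-point complex arithmetic, so it is a sanity check rather than a proof) evaluates $|\Phi_n(q)|$ as the product of the distances $|q - w|$ over the primitive $n$th roots of unity $w$ in $\mathbb{C}$.

```python
import cmath
from math import gcd

def abs_cyclotomic_value(n, q):
    """|Phi_n(q)| computed as the product of |q - w| over the primitive
    nth roots of unity w = exp(2*pi*i*k/n) with gcd(k, n) = 1."""
    value = 1.0
    for k in range(1, n + 1):
        if gcd(k, n) == 1:
            w = cmath.exp(2j * cmath.pi * k / n)
            value *= abs(q - w)
    return value

# Sample ranges chosen only for illustration.
strict = all(abs_cyclotomic_value(n, q) > q - 1
             for n in range(2, 13) for q in range(2, 10))
```

For every $n > 1$ and $q \geq 2$ in the sampled range the value exceeds $q - 1$, so $\Phi_n(q)$ cannot divide $q - 1$ there.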


Wedderburn's theorem can be extended. If $R$ is a finite division ring, say $|R| = n$,
Lagrange's theorem shows that $r^{n-1} = 1$ for all $r \neq 0$ in $R$, so $r^n = r$ for all $r \in R$.
Hence, finite division rings are periodic, where $R$ is called a periodic ring if, for
all $r \in R$, $r^n = r$ for some integer $n$ (depending on $r$). Thus, Wedderburn's theorem
is a special case of Jacobson's theorem: Every periodic ring is commutative.


Returning to cyclotomic polynomials, we conclude this section by showing that
$\Phi_n$ is irreducible in $\mathbb{Q}[x]$ for every $n \geq 1$ (so the factorization of $x^n - 1$ into
irreducibles in $\mathbb{Q}[x]$ is given by (2) of Theorem 1). This is true if $n$ is prime by Example
13 §4.2, but to prove it in general, we need some notions from Chapter 5.

If $f \neq 0$ is a polynomial in $\mathbb{Z}[x]$, the gcd of its coefficients is called the content of
$f$, denoted $c(f)$, and $f$ is said to be primitive if $c(f) = 1$. We can easily show (Lemma
4 §5.1) that if $c(f) = c$, then $f = cf_1$, where $f_1$ is primitive. The key observation
about this is Gauss' lemma (Theorem 8 §5.1), which states that $c(fg) = c(f)c(g)$
for all nonzero $f$ and $g$ in $\mathbb{Z}[x]$.


Lemma 3. Let $f \in \mathbb{Z}[x]$ be monic. If $f = p_1q_1$ in $\mathbb{Q}[x]$, then $f$ can be written as
$f = pq$ in $\mathbb{Z}[x]$, where $p$ and $q$ are monic and $p = rp_1$ and $q = sq_1$ for some $r$ and $s$
in $\mathbb{Q}$.

Proof. Choose $a, b \in \mathbb{Z}$ such that $ap_1 = p_0$ and $bq_1 = q_0$ are in $\mathbb{Z}[x]$ and then write
$p_0 = cp$ and $q_0 = dq$, where $p, q \in \mathbb{Z}[x]$ are primitive. Hence, $abf = cdpq$. Because $f$
is also primitive (being monic), Gauss' lemma gives $ab = c(abf) = cd$. Hence, $f = pq$
in $\mathbb{Z}[x]$, so, as $f$ is monic, $p$ and $q$ may be assumed to be monic. As $ap_1 = cp$ and
$bq_1 = dq$, the proof is complete. □


Theorem 4. $\Phi_n$ is irreducible in $\mathbb{Q}[x]$ for every $n$.


Proof. Let $w$ be a primitive $n$th root of unity. Because $\Phi_n$ is monic in $\mathbb{Z}[x]$ and
$\Phi_n(w) = 0$, the minimal polynomial $m_1$ of $w$ over $\mathbb{Q}$ divides $\Phi_n$. Hence, by Lemma
3, write $\Phi_n = mf$ in $\mathbb{Z}[x]$, where $m$ is monic, $m(w) = 0$, and $m$ is irreducible in
$\mathbb{Q}[x]$. We demonstrate that $\deg m = \deg \Phi_n$ by showing that $m(w^k) = 0$ for all
integers $k$ relatively prime to $n$ (every primitive root of unity is such a $w^k$).

To do so, it suffices to show that $m(w^p) = 0$ for any prime $p$ not dividing $n$.
Suppose that $m(w^p) \neq 0$ for such a prime. Then $0 = \Phi_n(w^p) = m(w^p)f(w^p)$, so
$f(w^p) = 0$. Thus, $w$ is a root of $f(x^p)$, so (as $m$ is irreducible) $f(x^p) = mg$ for
$g \in \mathbb{Q}[x]$. But $m$ is monic in $\mathbb{Z}[x]$, so $f(x^p) = qm + r$ in $\mathbb{Z}[x]$ where either $r = 0$ or
$\deg r < \deg m$. This gives $r = (g - q)m$, so $g = q \in \mathbb{Z}[x]$. Hence, $f(x^p) = mg$ holds
in $\mathbb{Z}[x]$, and so taking the coefficients modulo $p$ (indicated by a bar),


$$\bar{m}\,\bar{g} = \bar{f}(x^p) = \bar{f}(x)^p \quad \text{in } \mathbb{Z}_p[x].$$


Thus, $\bar{m}$ and $\bar{f}$ have a common irreducible factor in $\mathbb{Z}_p[x]$. On the other hand,
$x^n - 1 = \Phi_n h$ for some $h \in \mathbb{Z}[x]$, so


$$x^n - 1 = \bar{m}\,\bar{f}\,\bar{h} \quad \text{in } \mathbb{Z}_p[x].$$


Hence, $x^n - 1$ has a repeated irreducible factor in $\mathbb{Z}_p[x]$, and so a multiple zero in
some extension field of $\mathbb{Z}_p$, a contradiction as $p$ does not divide $n$. □


Example 4. Note that it is essential that $\Phi_n$ is taken over $\mathbb{Q}$ in Theorem 4. For
example, $\Phi_6 = x^2 - x + 1$ becomes $\Phi_6 = (x + 1)^2$ in $\mathbb{Z}_3[x]$.


Exercises 10.4 


1. Find (a) Bz, (b) Pio, (c) Bro, (d) Gis, and (e) Dig. 

2. If $p$ is a prime, show that $\Phi_{p^n} = 1 + x^q + x^{2q} + \cdots + x^{(p-1)q}$, where $q = p^{n-1}$.

3. If $n \geq 3$ is odd, show that $\Phi_{2n} = \Phi_n(-x)$. [Hint: $\Phi_2(x) = -\Phi_1(-x)$.]

4. Show that $n = \sum_{d \mid n} \varphi(d)$ if $n \geq 1$ ($\varphi$ is the Euler function). [Hint: Theorem 1(2).]

5. If $\gcd(m, n) = 1$, prove that $x^{mn} - 1$ and $(x^m - 1)(x^n - 1)$ have the same splitting
fields over $\mathbb{Q}$.

6. (a) Show finite subrings of division rings are division rings.
(b) If $R$ is a division ring of characteristic $p \neq 0$, show that any finite subgroup $G$ of $R^*$
is cyclic. Is this true if char $R = 0$? [Hint: Regard $\mathbb{Z}_p \subseteq R$ and if $G = \{g_1, g_2, \ldots, g_n\}$,
consider $R_0 = \{\sum_i r_ig_i \mid r_i \in \mathbb{Z}_p\}$.]

7. The Möbius function $\mu : \mathbb{Z}^+ \to \{0, 1, -1\}$ is defined by

$\mu(1) = 1$,

$\mu(n) = 0$ if $n = p^2m$ for some prime $p$,

$\mu(n) = (-1)^k$ if $n = p_1p_2 \cdots p_k$, where $p_1, \ldots, p_k$ are distinct primes.

Show that $\sum_{d \mid n} \mu(d) = \begin{cases} 1, & \text{if } n = 1, \\ 0, & \text{if } n > 1. \end{cases}$

8. (a) Let maps $\alpha : \mathbb{Z}^+ \to \mathbb{Z}^+$ and $\beta : \mathbb{Z}^+ \to \mathbb{Z}^+$ be related by $\alpha(n) = \sum_{d \mid n} \beta(d)$. Show
that $\beta$ is given in terms of $\alpha$ by
$$\beta(n) = \sum_{d \mid n} \mu\left(\frac{n}{d}\right)\alpha(d) = \sum_{d \mid n} \mu(d)\,\alpha\left(\frac{n}{d}\right).$$
This is called the Möbius inversion formula. [Hint: $d \mid n$ and $c \mid (n/d)$ $\Leftrightarrow$ $dc \mid n$ $\Leftrightarrow$ $c \mid n$
and $d \mid (n/c)$.]
(b) Prove that $\Phi_n = \prod_{d \mid n}(x^d - 1)^{\mu(n/d)}$, where $\mu$ is the Möbius $\mu$-function.
[Hint: Exercise 7; use a formal logarithm.]

9. Let $n = p_1^{a_1}p_2^{a_2} \cdots p_r^{a_r}$, where $p_i$ are distinct primes, and let $m = p_1p_2 \cdots p_r$. Show
that $\Phi_n(x) = \Phi_m(x^{n/m})$. [Hint: Exercise 8(b).]

10. If $p$ is an odd prime and $p$ does not divide $n$, show that $\Phi_{pn}(x) \cdot \Phi_n(x) = \Phi_n(x^p)$.
[Hint: Exercise 8(b).]


Chapter 11 


Finiteness Conditions for Rings 
and Modules 


A scientist worthy of the name, above all a mathematician, experiences in his work the 
same impression as an artist; his pleasure is as great and of the same nature. 


—Henri Poincaré 


The field C of complex numbers is a two-dimensional vector space over R. A ring 
that is also a vector space over a field F’ is called an algebra over F. In the nineteenth 
century many attempts were made to describe the division algebras that are finite 
dimensional over R, and in particular those (like C) that are fields. Certainly the 
three-dimensional examples would have had applications to physics, but after look- 
ing in vain for such an algebra, the first success came in 1843 when W.R. Hamilton 
discovered the ring H of quaternions, a four-dimensional algebra that, surprisingly, 
was not commutative. It was not until 1878 that G. Frobenius showed that there 
is no three-dimensional example, and that the only possible associative exam- 
ples are R, C, and H. Meanwhile, G. Grassmann described many examples that 
were rings but not necessarily division rings, of which the matrix algebras, con- 
structed in 1858 by A. Cayley, were an important special case. The next event 
in the development of the theory came in 1907 when J.H.M. Wedderburn gave 
the first characterization of the simple finite dimensional algebras. Finally, in 1927, 
E. Artin extended Wedderburn’s theorem to a result about rings by eliminating the 
dependence on the fact that the ring is finite dimensional as an algebra, replacing 
it with finiteness conditions on the set of left ideals. These seminal results mark the
beginning of noncommutative ring theory, and we prove them in this
chapter.


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 


11.1 WEDDERBURN’S THEOREM 


If $F$ is a field, a ring $R$ is called an $F$-algebra if it is a vector space over $F$ and
$t(ab) = a(tb)$ for all $t \in F$ and all $a, b \in R$. Hence, $\mathbb{C}$ is a two-dimensional $\mathbb{R}$-algebra
and the ring $M_n(F)$ of all $n \times n$ matrices over $F$ is an $n^2$-dimensional $F$-algebra.
The ring $\mathbb{H}$ of real quaternions is a four-dimensional algebra over $\mathbb{R}$ that has the
distinction of being a division ring (every nonzero element is a unit). Wedderburn's
theorem asserts that if $R$ is a simple, finite dimensional algebra then $R \cong M_n(D)$
for some $n \geq 1$ and some division ring $D$. We are going to prove an extension of
this theorem that removes the restriction that $R$ is an algebra. The first task is to
find an appropriate "finiteness condition" on $R$ to replace the finite dimensional
requirement.


Finiteness Conditions 


If $R$ is a ring, recall (Section 7.1) that a left $R$-module is an additive abelian group $M$
with a left $R$-action $rx \in M$, $r \in R$, $x \in M$, such that the axioms for a vector space
are satisfied. In the nineteenth and early twentieth centuries, most of the modules
studied were finite dimensional vector spaces. Emmy Noether, and later Emil Artin,
realized that the right way to extend this finite dimensional condition was to use
finiteness conditions on the set of submodules. Lemma 1 below identifies the most
important of these.

If $\mathcal{S}$ is a nonempty set of submodules of a module $M$, then $N \in \mathcal{S}$ is said to be
maximal in $\mathcal{S}$ if $N \subseteq K \in \mathcal{S}$ implies that $K = N$. Similarly $N$ is called minimal
in $\mathcal{S}$ if $N \supseteq K \in \mathcal{S}$ implies that $K = N$. For example, the maximal ideals in a ring $R$
are the maximal members of $\mathcal{S} = \{A \mid A \text{ is an ideal of } R \text{ and } A \neq R\}$.


Lemma 1. If $M$ is a module, consider the following conditions where the $K_i$
denote submodules of $M$. Then (ACC) $\Leftrightarrow$ (MAX) and (DCC) $\Leftrightarrow$ (MIN).


(ACC) If $K_1 \subseteq K_2 \subseteq \cdots$ then $K_n = K_{n+1} = \cdots$ for some $n$.
(MAX) Every nonempty set of submodules of $M$ has a maximal member.


(DCC) If $K_1 \supseteq K_2 \supseteq \cdots$ then $K_n = K_{n+1} = \cdots$ for some $n$.
(MIN) Every nonempty set of submodules of $M$ has a minimal member.


Proof. If (MAX) holds then (ACC) holds where $K_n$ is any maximal member of
$\{K_i \mid i \geq 1\}$. Conversely, suppose that $\mathcal{S}$ is a nonempty set of submodules of $M$
that has no maximal member. Choose $K_1 \in \mathcal{S}$. Since $K_1$ is not maximal in $\mathcal{S}$ there
exists $K_2 \in \mathcal{S}$ such that $K_1 \subset K_2$. But $K_2$ is also not maximal in $\mathcal{S}$ so we obtain
$K_1 \subset K_2 \subset K_3$ for some $K_3 \in \mathcal{S}$. This process continues indefinitely$^{129}$ to violate
(ACC). This proves that (ACC) $\Leftrightarrow$ (MAX); the proof that (DCC) $\Leftrightarrow$ (MIN) is
analogous. □


The notations ACC and DCC refer to the ascending and descending chain
conditions on submodules, respectively. A module $M$ is called noetherian if it
has the ACC, and $M$ is called artinian if it has the DCC. If $R$ is an $F$-algebra with
unity 1 then any left module ${}_RM$ becomes an $F$-space via the action $tx = (t1)x$


129 Actually this requires a set-theoretical theorem called transfinite recursion. This is discussed 
in Appendix D. 


for all $t \in F$ and $x \in M$. Hence every submodule of $M$ is a subspace, so finite
dimensional modules are both noetherian and artinian.$^{130}$


Example 1. Regarded as a module over itself, $\mathbb{Z}$ is noetherian but not artinian.


Proof. $\mathbb{Z}$ is not artinian as $\mathbb{Z} \supset \mathbb{Z}2 \supset \mathbb{Z}4 \supset \cdots$ shows. For the converse, suppose
$K_1 \subseteq K_2 \subseteq \cdots$ are subgroups of $\mathbb{Z}$. Then $K = \cup K_i$ is a subgroup and so $K = \mathbb{Z}k$
for some $k \in \mathbb{Z}$ by Theorem 7 §2.4. But then $k \in K_n$ for some $n$ and it follows
easily that $K_n = K_{n+1} = \cdots = K$. Hence $\mathbb{Z}$ is noetherian. □
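
To see the ascending chain condition at work in $\mathbb{Z}$, recall that $\mathbb{Z}a_1 + \cdots + \mathbb{Z}a_i = \mathbb{Z}k_i$ where $k_i = \gcd(a_1, \ldots, a_i)$. The sketch below (the function name and sample integers are assumptions made for illustration) lists the generators $k_i$ of such a chain; each $k_{i+1}$ divides $k_i$, so the chain must become constant.

```python
from itertools import accumulate
from math import gcd

def ascending_chain_generators(elements):
    """Generators k_i >= 0 of the subgroups K_i = Z a_1 + ... + Z a_i of Z,
    so that K_i = Z k_i with k_i = gcd(a_1, ..., a_i)."""
    return list(accumulate(elements, gcd))

# Sample integers chosen only for illustration.
generators = ascending_chain_generators([360, 84, 90, 35, 12, 7])
# Each generator divides the previous one, i.e. K_i is contained in K_(i+1).
is_ascending = all(a % b == 0 for a, b in zip(generators, generators[1:]))
```

In this run the generators are 360, 12, 6, 1, 1, 1, so the chain stabilizes at $K_4 = \mathbb{Z}$. By contrast, the descending chain $\mathbb{Z} \supset \mathbb{Z}2 \supset \mathbb{Z}4 \supset \cdots$ never stabilizes.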


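If $p \in \mathbb{Z}$ is a prime, let $X = \{\frac{m}{p^k} \in \mathbb{Q} \mid m \in \mathbb{Z},\ k \geq 0\}$. This is an additive
subgroup of $\mathbb{Q}$, and one verifies that the groups $\mathbb{Z} \subset \mathbb{Z}\frac{1}{p} \subset \mathbb{Z}\frac{1}{p^2} \subset \cdots \subset X$ are the
only subgroups of $X$ containing $\mathbb{Z}$ (Exercise 9). The factor group $\mathbb{Z}_{p^\infty} = X/\mathbb{Z}$ is
called the Prüfer $p$-group, and this shows that the only subgroups of $\mathbb{Z}_{p^\infty}$ are


$$0 \subset \mathbb{Z}x_1 \subset \mathbb{Z}x_2 \subset \mathbb{Z}x_3 \subset \cdots \subset \mathbb{Z}_{p^\infty},$$


where $x_k = \frac{1}{p^k} + \mathbb{Z}$ for each $k$. Furthermore, $o(x_k) = p^k$ and $px_{k+1} = x_k$ hold for
each $k$. Hence $\mathbb{Z}_{p^\infty}$ is an infinite group, but every proper subgroup is finite. Clearly


Example 2. The Prüfer group $\mathbb{Z}_{p^\infty}$ is artinian but not noetherian as a $\mathbb{Z}$-module.

The following basic properties of the ACC and the DCC will be needed.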


Lemma 2. If $N \subseteq M$ are modules then $M$ is noetherian (artinian) if and only if
the same is true of both $N$ and $M/N$.


Proof. We prove the noetherian case; the other is analogous. If $M$ has the ACC
then $N$ has the ACC because submodules of $N$ are submodules of $M$. On the other
hand, every submodule $X$ of $M/N$ has the form $X = K/N$ for some submodule $K$
of $M$ containing $N$ by Theorem 5 §8.1. Hence every ascending chain in $M/N$ takes
the form $K_1/N \subseteq K_2/N \subseteq \cdots$ where $K_1 \subseteq K_2 \subseteq \cdots$ in $M$. It follows that $M/N$
has the ACC.

Conversely, let $K_1 \subseteq K_2 \subseteq \cdots$ be a chain from $M$. If both $N$ and $M/N$ have
the ACC then the chains $N \cap K_1 \subseteq N \cap K_2 \subseteq \cdots$ and $\frac{N + K_1}{N} \subseteq \frac{N + K_2}{N} \subseteq \cdots$
both terminate, so there exists $n \geq 1$ such that $N \cap K_n = N \cap K_{n+1} = \cdots$ and
$\frac{N + K_n}{N} = \frac{N + K_{n+1}}{N} = \cdots$ (whence $N + K_n = N + K_{n+1} = \cdots$). Since $K_i \subseteq K_{i+1}$ for
each $i$, we have $K_{i+1} \cap (K_i + N) = K_i + (K_{i+1} \cap N)$ by the modular law (Theorem
2 §7.1). So if $i \geq n$ we have


\begin{align*}
K_{i+1} = K_{i+1} \cap (K_{i+1} + N) &= K_{i+1} \cap (K_i + N) = K_i + (K_{i+1} \cap N) \\
&= K_i + (K_i \cap N) = K_i.
\end{align*}
This proves that $M$ is noetherian. □


Corollary. A sum $M = M_1 + \cdots + M_n$ of modules is artinian (noetherian) if and
only if the same is true of each $M_i$.


Proof. We prove only the artinian case, the other being analogous. If $M$ is artinian,
so are its submodules $M_i$ by Lemma 2. Conversely, if $n \geq 2$, assume inductively
that $K = M_1 + \cdots + M_{n-1}$ is artinian. By Corollary 2 of Theorem 1 §7.1, it follows


130 The ACCP designation used extensively in Chapter 5 is just the ACC applied to the set of all
principal ideals in an integral domain.


that $M/K = (M_n + K)/K \cong M_n/(M_n \cap K)$ is also artinian by Lemma 2, being an
image of $M_n$. Hence, $M$ is artinian, again by Lemma 2, as required. □


Let $M_1, \ldots, M_n$ be modules. The external direct sum $M = M_1 \oplus \cdots \oplus M_n$ is
artinian (noetherian) if and only if the same is true of each $M_i$ by the Corollary
because $M$ is a sum of submodules $M_i'$ isomorphic to the $M_i$.

Because the $\mathbb{Z}$-action on an abelian group is naturally written on the left, we
discussed only left $R$-modules in Chapter 7. However, an additive abelian group $M$
is called a right $R$-module (written $M_R$) if $R$ acts on the right: That is, for any
$r \in R$ and $x \in M$, an element $xr \in M$ is defined such that


$$(x + y)r = xr + yr, \quad x(r + s) = xr + xs, \quad (xr)s = x(rs), \quad \text{and} \quad x1 = x$$


hold for all $x, y \in M$ and all $r, s \in R$.$^{131}$ With this, the definitions of submodules and
homomorphisms of right modules are analogous to those for left modules. Moreover,
the analogues of theorems about left modules in Section 7.1 go through verbatim
for right modules. Note that the distinction does not matter for a commutative
ring $R$ because a left $R$-module $M$ becomes a right module if we define the action
by $x \cdot r = rx$ for all $r \in R$ and $x \in M$.

A ring $R$ is called left artinian (left noetherian) if ${}_RR$ is artinian (noetherian),
with similar definitions on the right. The submodules of ${}_RR$ ($R_R$) are the left (right)
ideals.


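Example 3. In each case consider the subring $R$ of $M_2(\mathbb{R})$.

(1) $R = \begin{bmatrix} \mathbb{Z} & 0 \\ \mathbb{Q} & \mathbb{Q} \end{bmatrix}$ is left noetherian but not right noetherian.

(2) $R = \begin{bmatrix} \mathbb{Q} & 0 \\ \mathbb{R} & \mathbb{R} \end{bmatrix}$ is left artinian but not right artinian.


Solution.
(1) If $A = \begin{bmatrix} 0 & 0 \\ \mathbb{Q} & 0 \end{bmatrix}$ then $A$ is an ideal of $R$ and $R/A \cong \mathbb{Z} \times \mathbb{Q}$ is noetherian as
a ring by the Corollary to Lemma 2 (verify). Hence, $R/A$ is noetherian as a left
$R$-module (the left $R$-submodules of $R/A$ are just the left ideals of the ring $R/A$).
Since ${}_RA$ is also noetherian (verify), it follows by Lemma 2 that $R$ is left noetherian.
But the sequence

$$\begin{bmatrix} 0 & 0 \\ \mathbb{Z} & 0 \end{bmatrix} \subset \begin{bmatrix} 0 & 0 \\ \frac{1}{2}\mathbb{Z} & 0 \end{bmatrix} \subset \begin{bmatrix} 0 & 0 \\ \frac{1}{4}\mathbb{Z} & 0 \end{bmatrix} \subset \cdots$$

of right ideals shows that $R$ is not right noetherian. This proves (1).

(2) Observe that $\mathbb{R}$ is not finite dimensional as a $\mathbb{Q}$-space (otherwise $\mathbb{R}$ would
be countable contradicting Cantor's theorem from set theory). Hence, there exist
$\mathbb{Q}$-subspaces $X_1 \supset X_2 \supset \cdots$ in $\mathbb{R}$. But then

$$\begin{bmatrix} 0 & 0 \\ X_1 & 0 \end{bmatrix} \supset \begin{bmatrix} 0 & 0 \\ X_2 & 0 \end{bmatrix} \supset \cdots$$

are right ideals of $R$, proving that $R$ is not right artinian. The proof that $R$ is left artinian
is analogous to the argument in (1). □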


We note in passing that, despite Example 3, every left artinian ring is automatically
left noetherian; this result is called the Hopkins-Levitzki theorem, proved
independently in 1939 by Charles Hopkins and Jacob Levitzki.


A left $R$-module ${}_RM$ is said to be simple if it is nonzero and satisfies the
following equivalent conditions:

(1) The only submodules of $M$ are $0$ and $M$.

(2) $Rx = M$ for all $0 \neq x \in M$.


131 These are the analogues of the axioms M1-M4 in Section 7.1. 


Thus, the simple $\mathbb{Z}$-modules are the cyclic groups $\mathbb{Z}_p$ where $p \in \mathbb{Z}$ is a prime. Every
simple module ${}_RM$ is principal, that is, $M = Rx$ for some $x \in M$. If $R = D$ is a
division ring the converse holds: If ${}_DM = Dx$ is simple, then $M \cong {}_DD$ via $dx \mapsto d$.
In particular, the simple modules over a field are the one-dimensional vector spaces.

As for groups, a series $M = M_0 \supset M_1 \supset \cdots \supset M_n = 0$ of submodules of a
module $M$ is called a composition series of length $n$ for $M$ if all the factors
$M_i/M_{i+1}$ are simple modules (see Section 9.1).


Theorem 1. Let $M \neq 0$ be a module.
(1) $M$ has a composition series if and only if it is both noetherian and artinian.
(2) Jordan-Hölder Theorem. Any two composition series for $M$ have the
same length, and the factors can be paired so that corresponding factors
are isomorphic.


Proof.

(1) If $M = M_0 \supset M_1 \supset \cdots \supset M_n = 0$ is a composition series then $M_{n-1}$
and $M_{n-2}/M_{n-1}$ are both simple, so $M_{n-2}$ is noetherian and artinian by Lemma
2. Then the same is true of $M_{n-3}$ because $M_{n-3}/M_{n-2}$ is simple. Continuing we
see that every $M_i$ (including $M_0 = M$) is noetherian and artinian.

Conversely, let $M$ be noetherian and artinian. Since $M$ is artinian, let $K_1 \subseteq M$
be a simple submodule. If $K_1 \neq M$, choose $K_2$ minimal in the set $\{K \mid K \supset K_1\}$.
Then $K_2 \supset K_1 \supset 0$ and $K_2/K_1$ is simple. If $K_2 \neq M$, let $K_3 \supset K_2 \supset K_1 \supset 0$ where
$K_3/K_2$ is simple. Since $M$ is noetherian, this process cannot continue indefinitely, so
some $K_n = M$ and we have created a composition series.

(2) The analogue of the proof of the Jordan-Hölder theorem for arbitrary groups
(Theorem 1 §9.1) goes through. □


The length n of any composition series for M is called the composition length 
of M and denoted length(M). Note that Lemma 2 and Theorem 1 combine to show 
that, if $K \subseteq M$ are modules, then $M$ has a composition series if and only if both
K and M/K have composition series. Moreover, in this case the proof of Theorem 
2 §9.1 goes through to show that 


length(M) = length(K) + length(M/K), (*) 


and that the composition factors of M are exactly those of K and M/K. 


The finitely generated vector spaces over a field are called finite dimensional, 
and Theorem 6 §6.1 shows that they all have a finite basis. The same is true for
any division ring, but we give a different proof using the Jordan-Hölder theorem.


Corollary. Let $D$ be a division ring and let ${}_DM$ be a module. Then
(1) $M$ is finitely generated if and only if it has a finite basis.
(2) Any two bases of $M$ contain the same number of elements, say $n$.
(3) Every nonzero submodule of $M$ has a basis of at most $n$ elements.


Proof. (1) Since $M$ is finitely generated, let $n \geq 1$ be minimal such that $M$ has a
set $\{x_1, \ldots, x_n\}$ such that $M = \sum_i Dx_i$. If $\sum_i d_ix_i = 0$ and $d_k \neq 0$ for some $k$, then
(since $d_k^{-1} \in D$) it follows that $M = \sum_{i \neq k} Dx_i$ contradicting the minimality of $n$.
So $\{x_1, \ldots, x_n\}$ is independent and hence is a basis of $M$. This proves (1).


(2) If $\{x_1, \ldots, x_n\}$ is a basis, then $M = Dx_1 \oplus \cdots \oplus Dx_n$ where $Dx_i$ is simple,
and we obtain a composition series


$$M \supset Dx_2 \oplus \cdots \oplus Dx_n \supset \cdots \supset Dx_{n-1} \oplus Dx_n \supset Dx_n \supset 0$$


for $M$. Hence $n$ is the composition length of $M$, proving (2).
(3) If $K \subseteq {}_DM$ is a submodule, then $K$ has a composition series by Lemma 2
and Theorem 1, and the composition length is at most $n$ by $(*)$. □


If $D$ is a division ring, the number of elements in any finite basis of a module ${}_DM$ is
called the dimension of $M$ and denoted $\dim M$.$^{132}$ With this, most of the theorems
about finite dimensional vector spaces (Section 6.1) go through for finitely generated
modules over $D$.


Endomorphism Rings 


If $R$ is a ring and $M$ and $N$ are two $R$-modules, recall that a map $\alpha : M \to N$ is
called $R$-linear (or an $R$-morphism) if $\alpha(x + y) = \alpha(x) + \alpha(y)$ and $\alpha(rx) = r\alpha(x)$
for all $x, y \in M$ and all $r \in R$. Many results about rings (in particular, Wedderburn's
theorem) arise from representing them as rings of $R$-linear maps.

If $M$ and $N$ are two modules write


$$\hom(M, N) = \{\alpha \mid \alpha : M \to N,\ \alpha \text{ is } R\text{-linear}\}.$$

If $\alpha, \beta \in \hom(M, N)$, define $\alpha + \beta$ and $-\alpha$ by

$$(\alpha + \beta)(x) = \alpha(x) + \beta(x) \quad \text{and} \quad (-\alpha)(x) = -\alpha(x), \quad \text{for all } x \in M.$$


These are $R$-linear and make $\hom(M, N)$ into an abelian group. Furthermore,
composition of maps distributes over this addition:


$\gamma(\alpha + \beta) = \gamma\alpha + \gamma\beta$ whenever $M \xrightarrow{\alpha,\,\beta} N \xrightarrow{\gamma} K$ are $R$-linear,
$(\alpha + \beta)\delta = \alpha\delta + \beta\delta$ whenever $L \xrightarrow{\delta} M \xrightarrow{\alpha,\,\beta} N$ are $R$-linear.


All these routine verifications are left to the reader.
Our interest here is in a special case: If $M$ is any module, an $R$-linear map
$\alpha : M \to M$ is called an endomorphism of $M$, and we write


$$\operatorname{end} M = \hom(M, M).$$


The additive abelian group $\operatorname{end} M$ becomes a ring, called the endomorphism
ring of $M$, if we define addition as above and use composition of maps as the
multiplication. Again we leave to the reader the routine verifications of the ring
axioms, and that $\operatorname{end} M \cong \operatorname{end} N$ whenever $M \cong N$. Note that the unity of $\operatorname{end} M$
is the identity map $1_M$.

If $S$ is a ring, denote the ring of all $n \times n$ matrices over $S$ by $M_n(S)$. Wedderburn's
theorem asserts that certain rings are isomorphic to $M_n(D)$ where $D$ is a division
ring. The next result shows how matrix rings arise as endomorphism rings. If $M$ is
a module, recall that $M^n$ denotes the external direct sum of $n$ copies of $M$.


Lemma 3. Let ${}_RM$ be a module and write $S = \operatorname{end} M$ for the endomorphism ring.
Then $\operatorname{end}(M^n) \cong M_n(S)$ as rings for each integer $n \geq 1$.


132 In fact, every nonzero module over a division ring has a basis (and so is free). The proof
requires Zorn's lemma (Example 2, Appendix C).


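Proof. Identify $M^n$ with the set of $n$-tuples from $M$, and let $M \xrightarrow{\sigma_i} M^n \xrightarrow{\pi_i} M$
be the standard maps: $\sigma_i(x) = (0, \ldots, x, \ldots, 0)$, where $x$ is in position $i$, and
$\pi_i(x_1, x_2, \ldots, x_n) = x_i$. Then one verifies that

$$\sum_i \sigma_i\pi_i = 1_{M^n} \quad \text{and} \quad \pi_j\sigma_i = \delta_{ij}1_M, \quad \text{for all } j \text{ and } i,$$


where $\delta_{ij} = 0$ or $1$ according as $i \neq j$ or $i = j$.$^{133}$ Given $\alpha \in \operatorname{end}(M^n)$, we have
$\pi_i\alpha\sigma_j \in S$ for all $i$ and $j$, so we define


$$\theta : \operatorname{end}(M^n) \to M_n(S) \quad \text{by} \quad \theta(\alpha) = [\pi_i\alpha\sigma_j].$$

It is routine to see that $\theta(\alpha + \beta) = \theta(\alpha) + \theta(\beta)$ for all $\alpha$ and $\beta$, and that $\theta(1_{M^n}) = I$, the
identity matrix. The $(i, j)$ entry of the matrix $\theta(\alpha)\theta(\beta)$ is

$$\sum_k (\pi_i\alpha\sigma_k)(\pi_k\beta\sigma_j) = \pi_i\alpha\Big(\sum_k \sigma_k\pi_k\Big)\beta\sigma_j = \pi_i\alpha\beta\sigma_j,$$

so $\theta(\alpha)\theta(\beta) = \theta(\alpha\beta)$. Thus, $\theta$ is a ring homomorphism; we claim it is an isomorphism.
If $\theta(\alpha) = 0$ then $\pi_i\alpha\sigma_j = 0$ for all $i$ and $j$, so

$$\alpha = 1_{M^n}\,\alpha\,1_{M^n} = \Big(\sum_i \sigma_i\pi_i\Big)\alpha\Big(\sum_j \sigma_j\pi_j\Big) = \sum_{i,j} \sigma_i(\pi_i\alpha\sigma_j)\pi_j = 0.$$

This shows that $\theta$ is one-to-one, and it remains to show that $\theta$ is onto. To this end,
let $[\gamma_{ij}] \in M_n(S)$, and define $\alpha : M^n \to M^n$ by

$$\alpha(x_1, x_2, \ldots, x_n) = \Big(\sum_k \gamma_{1k}(x_k),\ \sum_k \gamma_{2k}(x_k),\ \ldots,\ \sum_k \gamma_{nk}(x_k)\Big).$$

Then, for every $x \in M$ we have


$$\pi_i\alpha\sigma_j(x) = \pi_i(\alpha(0, \ldots, x, \ldots, 0)) = \pi_i(\gamma_{1j}(x), \gamma_{2j}(x), \ldots, \gamma_{nj}(x)) = \gamma_{ij}(x).$$


It follows that $\pi_i\alpha\sigma_j = \gamma_{ij}$ for all $i$ and $j$, that is, $\theta(\alpha) = [\gamma_{ij}]$. This shows that $\theta$ is
onto, and so proves the lemma. □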


If $V$ is an $n$-dimensional vector space over a field $F$, then $V \cong F^n$ so Lemma 3
shows that $\operatorname{end}(V) \cong M_n(F)$, a familiar fact from linear algebra.
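
The map $\theta(\alpha) = [\pi_i\alpha\sigma_j]$ from the proof of Lemma 3 can be made concrete in the simplest case. The sketch below takes $M = \mathbb{Z}$ as a $\mathbb{Z}$-module, so that $S = \operatorname{end} M$ is identified with $\mathbb{Z}$ (an endomorphism of $\mathbb{Z}$ is multiplication by its value at 1). The two maps `alpha` and `beta` are hypothetical examples, and the helper names are our own.

```python
def sigma(j, x, n):
    """Standard injection M -> M^n: put x in position j (0-based), zeros elsewhere."""
    return tuple(x if k == j else 0 for k in range(n))

def pi(i, v):
    """Standard projection M^n -> M onto coordinate i (0-based)."""
    return v[i]

def theta(alpha, n):
    """Matrix [pi_i alpha sigma_j]; each entry is an endomorphism of Z,
    recorded here by its value at 1."""
    return [[pi(i, alpha(sigma(j, 1, n))) for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    """Product of two square integer matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two hypothetical Z-linear maps on Z^2, chosen only for illustration.
def alpha(v):
    x, y = v
    return (2 * x + 3 * y, -x + 4 * y)

def beta(v):
    x, y = v
    return (x - y, 5 * x)

def alpha_beta(v):
    # Left operators: the composite "alpha beta" means first beta, then alpha.
    return alpha(beta(v))

theta_alpha = theta(alpha, 2)
theta_beta = theta(beta, 2)
theta_product = theta(alpha_beta, 2)
```

Here `mat_mul(theta_alpha, theta_beta)` equals `theta_product`, which is the multiplicative property $\theta(\alpha)\theta(\beta) = \theta(\alpha\beta)$ in this small instance.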


We say that a mapping $M \xrightarrow{\beta} N$ acts as a left operator $x \mapsto \beta(x)$ because we
write $\beta$ on the left of its argument. If $M \xrightarrow{\beta} N \xrightarrow{\alpha} K$, we have the composite map
$\alpha\beta : M \to K$ where $\alpha\beta(x) = \alpha[\beta(x)]$ for all $x \in M$. Hence, because $\alpha$ and $\beta$ are left
operators, the notation $\alpha\beta$ means “first $\beta$ then $\alpha$”. This is somewhat unfortunate
because the order gets reversed, but the use of left operators is very common. On
the other hand, we could write a map $M \xrightarrow{\beta} N$ on the right of its argument, so that
$x \mapsto x\beta$. In this case the composite $M \xrightarrow{\beta} N \xrightarrow{\alpha} K$ is given by $x(\beta\alpha) = (x\beta)\alpha$, so
we write it as $\beta\alpha : M \to K$. In particular, $\beta\alpha$ means “first $\beta$ then $\alpha$” in the
same order as the arrows. Hence composition means a different thing depending on
whether the maps act on the left or the right, and we must always be clear which
we are using. Lemma 4 below gives a good illustration of the use of right operators.
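
The two conventions can be compared side by side. The sketch below is only an illustration: the helper names `compose_left` and `compose_right` and the two sample maps on the integers are assumptions made here, not notation from the text.

```python
def compose_left(alpha, beta):
    """Left-operator convention: (alpha beta)(x) = alpha(beta(x))."""
    return lambda x: alpha(beta(x))

def compose_right(beta, alpha):
    """Right-operator convention: x(beta alpha) = (x beta) alpha."""
    return lambda x: alpha(beta(x))

beta = lambda x: x + 1   # plays the role of M -> N
alpha = lambda x: 2 * x  # plays the role of N -> K

left = compose_left(alpha, beta)     # written "alpha beta"
right = compose_right(beta, alpha)   # written "beta alpha"
swapped = compose_left(beta, alpha)  # "first alpha, then beta": a different map
```

Both `left` and `right` describe the same composite (first `beta`, then `alpha`); only the order in which the two symbols are written differs.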


An element $e \in R$ is called an idempotent if $e^2 = e$. Note that 0 and 1 are
idempotents in any ring, and they are the only ones in a division ring (in fact in a
domain). If $e \in R$ is an idempotent, we define $eRe \subseteq R$ as follows:


\[ eRe = \{ere \mid r \in R\} = \{s \in R \mid es = s = se\} = eR \cap Re. \]


133. $\delta_{ij}$ is often called the Kronecker delta.




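Then $eRe$ is a ring with unity $e$, called the corner ring corresponding to $e$. The
name comes from the following example: If $R = M_2(F) = \begin{bmatrix} F & F \\ F & F \end{bmatrix}$ where $F$ is a
field, and if $e = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$, then $e^2 = e$ and $eRe = \begin{bmatrix} F & 0 \\ 0 & 0 \end{bmatrix}$.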
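
The three descriptions of $eRe$ can be compared by brute force in a small case. This is a sketch under the assumption $F = \mathbb{Z}/2$ (chosen only so that $R = M_2(F)$ is finite) with $e$ the matrix unit in the upper left corner; the helper `mul` is hypothetical.

```python
from itertools import product

P = 2  # assumption: F = Z/2, which keeps R = M_2(F) finite

def mul(a, b):
    """Multiply 2x2 matrices stored as ((a11, a12), (a21, a22)) modulo P."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) % P for j in range(2))
        for i in range(2)
    )

R = [((w, x), (y, z)) for w, x, y, z in product(range(P), repeat=4)]
e = ((1, 0), (0, 0))

eRe = {mul(mul(e, r), e) for r in R}                    # {ere | r in R}
fixed = {s for s in R if mul(e, s) == s == mul(s, e)}   # {s | es = s = se}
eR_cap_Re = {mul(e, r) for r in R} & {mul(r, e) for r in R}
```

All three sets consist of the matrices whose only possibly nonzero entry is in the upper left corner, which is the source of the name.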


These corner rings arise as endomorphism rings in a natural way. 


Lemma 4. If $R$ is a ring and $e^2 = e \in R$, then $eRe \cong \operatorname{end}(Re)$ where endomor-
phisms of $Re$ act as right operators.


Proof. Given $a \in eRe$, define a right operator

\[ \rho_a : Re \to Re \quad\text{by}\quad x\rho_a = xa, \quad\text{for all } x \in Re. \]

Note that $xa \in Re$ because $a \in eRe \subseteq Re$. Then $\rho_a$ preserves addition and, in fact,
$\rho_a \in \operatorname{end}(Re)$ because $(rx)\rho_a = (rx)a = r(xa) = r(x\rho_a)$ for all $r \in R$ and $x \in Re$.¹³⁴
Hence, we may define

\[ \theta : eRe \to \operatorname{end}(Re) \quad\text{by}\quad \theta(a) = \rho_a \text{ for each } a \in eRe. \]

We show that $\theta$ is a ring isomorphism. We have $\theta(e) = \rho_e = 1_{Re}$, so $\theta$ preserves
the unity. If $a, b \in eRe$, we have $\rho_{a+b} = \rho_a + \rho_b$ (verify) so $\theta$ preserves addition.
Also $\rho_{ab} = \rho_a\rho_b$ because $x\rho_{ab} = x(ab) = (xa)b = (x\rho_a)\rho_b = x(\rho_a\rho_b)$ for all $x \in Re$,
so $\theta$ preserves multiplication. Hence $\theta$ is a ring homomorphism. Moreover, $\theta$ is
one-to-one because $\theta(a) = 0$ implies that $\rho_a = 0$, that is, $xa = 0$ for all $x \in Re$.
Taking $x = e$ gives that $ea = 0$, so $a = 0$ because $a \in eRe$.

To show that $\theta$ is onto, let $\lambda : Re \to Re$ be $R$-linear, and define $a = e\lambda$. Then
$ae = a$ because $a \in Re$, and $ea = e(e\lambda) = (e^2)\lambda = e\lambda = a$. Hence $a \in eRe$. Finally,
if $x \in Re$ then $x = xe$, so $x\lambda = (xe)\lambda = x(e\lambda) = xa = x\rho_a$. Since this holds for all
$x \in Re$, we have $\lambda = \rho_a = \theta(a)$, so $\theta$ is onto. This completes the proof. ∎
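
Lemma 4 can be checked exhaustively in a small case by listing every $R$-linear map $Re \to Re$ and comparing with the right multiplications by elements of $eRe$. The following sketch assumes $R = M_2(\mathbb{Z}/2)$ and $e$ the upper left matrix unit; these choices, and the helper names, are illustrative assumptions rather than part of the lemma.

```python
from itertools import product

P = 2  # assumption: F = Z/2, so that R = M_2(F) is finite

def mul(a, b):
    """Multiply 2x2 matrices modulo P."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) % P for j in range(2))
        for i in range(2)
    )

def add(a, b):
    """Add 2x2 matrices modulo P."""
    return tuple(
        tuple((a[i][j] + b[i][j]) % P for j in range(2)) for i in range(2)
    )

R = [((w, x), (y, z)) for w, x, y, z in product(range(P), repeat=4)]
e = ((1, 0), (0, 0))
Re = sorted({mul(r, e) for r in R})
eRe = sorted({mul(mul(e, r), e) for r in R})

def is_R_linear(f):
    """f is a dict Re -> Re; test additivity and f(rx) = r f(x)."""
    additive = all(f[add(x, y)] == add(f[x], f[y]) for x in Re for y in Re)
    linear = all(f[mul(r, x)] == mul(r, f[x]) for r in R for x in Re)
    return additive and linear

# Enumerate all functions Re -> Re and keep the R-linear ones.
endos = []
for images in product(Re, repeat=len(Re)):
    f = dict(zip(Re, images))
    if is_R_linear(f):
        endos.append(f)

# Right multiplications x -> xa for a in eRe.
right_mults = [{x: mul(x, a) for x in Re} for a in eRe]
```

In this case the endomorphisms found by brute force are exactly the right multiplications, as the lemma predicts.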


Taking e = 1 in Lemma 4 gives a very important special case. 


Corollary. If $R$ is a ring then $R \cong \operatorname{end}({}_RR)$ as rings where the endomorphisms of
${}_RR$ are right multiplications by elements of $R$.


Wedderburn’s Theorem 


Wedderburn’s theorem asserts that every simple, left artinian ring is isomorphic 
to a matrix ring over a division ring. The division ring comes from the next result, 
due in 1905 to Issai Schur. 


Lemma 5. Schur’s Lemma. Let $M$ and $N$ be simple modules.

(1) If $\alpha : M \to N$ is $R$-linear, then either $\alpha = 0$ or $\alpha$ is an isomorphism.

(2) $\operatorname{end} M$ is a division ring.

Proof. (1) Observe that $\ker\alpha$ and $\operatorname{im}\alpha = \alpha(M)$ are submodules of $M$ and $N$,
respectively. If $\alpha \neq 0$ then $\ker\alpha \neq M$ and $\alpha(M) \neq 0$, so $\ker\alpha = 0$ and $\alpha(M) = N$
by simplicity. Hence $\alpha$ is an isomorphism, proving (1). Now (2) follows if $N = M$. ∎


In 1907, J.H.M. Wedderburn proved the following important theorem for finite 
dimensional algebras. A ring R is called simple if the only ideals are 0 and R. 


134. If we wrote $\rho_a$ as a left operator then $\rho_a(rx) = (ar)x$ and $r\rho_a(x) = (ra)x$ need not be equal.




Theorem 3. Wedderburn’s Theorem. The following conditions are equivalent
for a ring $R$:


(1) $R$ is a simple ring that is left artinian.

(2) $R$ is a simple ring that has a simple left ideal.

(3) $R \cong M_n(D)$ for some $n \geq 1$ and some division ring $D$.

(4) The right-left analogues of (1) and (2).


Furthermore, the integer n is uniquely determined by R, as is the division ring D 
up to isomorphism. 


Proof. (1)⇒(2) is clear, and (4) follows by the left-right symmetry in (3).

(2)⇒(3). Let $L$ be a simple left ideal of $R$, and recall that $LR$ is the set of all finite
sums of products $ba$ where $b \in L$ and $a \in R$. Then $LR$ is an ideal of $R$ containing $L$,
so $LR = R$ because $R$ is a simple ring. In particular, let $1 = b_1a_1 + b_2a_2 + \cdots + b_na_n$
where $n \geq 1$, and where $b_i \in L$ and $a_i \in R$ for each $i$. Assume that $n$ is the smallest
positive integer with this property. The reader can verify that

\[ R = La_1 + La_2 + \cdots + La_n. \qquad (*) \]

Note that $La_i \neq 0$ for each $i$ because $b_ia_i \neq 0$ by the minimality of $n$. Hence $L \cong La_i$
because the map $x \mapsto xa_i$ is a nonzero, onto, $R$-morphism $L \to La_i$, and $L$ is simple.
In particular each $La_i$ is simple.

Now we claim that the sum in (*) is direct. By Theorem 3 §7.1, we must show
that $La_k \cap (\sum_{i \neq k} La_i) = 0$ for each $k$. Suppose not. Then $La_k \cap (\sum_{i \neq k} La_i)$ is a
nonzero left ideal contained in $La_k$. Hence, $La_k \cap (\sum_{i \neq k} La_i) = La_k$ because $La_k$ is
simple. But then $La_k \subseteq \sum_{i \neq k} La_i$, and it follows that $R = \sum_{i \neq k} La_i$, contrary to the
minimality of $n$. Hence (*) is a direct sum.

Since $La_i \cong L$ for each $i$, (*) gives ${}_RR = \bigoplus_{i=1}^{n} La_i \cong L^n$. But then Lemma 3 and
the Corollary to Lemma 4 give

\[ R \cong \operatorname{end}({}_RR) \cong \operatorname{end}(L^n) \cong M_n(\operatorname{end} L). \]

Since $\operatorname{end} L$ is a division ring by Schur’s lemma, (3) follows with $D = \operatorname{end} L$.

(3)⇒(1). The ring $M_n(D)$ is simple by the Corollary to Theorem 7 §3.3 so,
given (3), it remains to show that $M_n(D)$ is left artinian. But $M_n(D)$ is a finite
dimensional vector space over $D$ (in fact the dimension is $n^2$), and so is artinian
as a $D$-space. Hence, the ring $M_n(D)$ is left artinian because every left ideal is a
$D$-subspace. This proves (1).


Uniqueness. The fact that ${}_RR \cong L^n$ shows that $n$ is the composition length of
${}_RR$ and so is uniquely determined by $R$. To show that $D$ is uniquely determined,
we prove that $D \cong \operatorname{end} K$ for any simple left ideal $K$ of $R$. But the proof of (2)⇒(3)
shows that $R \cong K^n$, and hence that $L^n \cong K^n$. We have maps
$L \xrightarrow{\sigma_1} L^n \xrightarrow{\tau} K^n \xrightarrow{\pi_i} K$
where $\tau$ is an isomorphism, $\sigma_1$ is the inclusion, and the $\pi_i$ are the projections. Then
$\pi_i\tau\sigma_1 \neq 0$ for some $i$ because $\tau\sigma_1(L) \neq 0$. Hence $L \cong K$ by Schur’s lemma, and so
$\operatorname{end} K \cong \operatorname{end} L = D$, as required. ∎


Remark. The “left artinian” condition in (1) of Theorem 3 cannot be replaced 
with “left noetherian”. In fact, there exists a simple, left and right noetherian 
domain that contains no simple left ideal. It is called the first Weyl algebra, after 
Hermann Weyl, and can be described roughly as the ring of polynomials over $\mathbb{R}$ in
noncommuting indeterminates $x$ and $y$ which satisfy the condition that $xy - yx = 1$.




This ring first arose in quantum mechanics as an algebra generated by position and 
momentum operators. 
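
The relation $xy - yx = 1$ can be seen concretely by letting the two indeterminates act on polynomials. The sketch below assumes the familiar realization in which $x$ acts as differentiation $d/dt$ and $y$ as multiplication by $t$; this choice, and the helper names `D`, `T`, `sub`, `trim`, are assumptions made for illustration and are not taken from the text.

```python
def D(p):
    """Differentiate a polynomial given as a coefficient list [c0, c1, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def T(p):
    """Multiply a polynomial by the variable t."""
    return [0] + list(p)

def sub(p, q):
    """Coefficientwise difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients (keeping at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

f = [5, 0, -3, 7]  # the polynomial 5 - 3t^2 + 7t^3

# (xy - yx) applied to f, with x = d/dt and y = multiplication by t.
commutator = trim(sub(D(T(f)), T(D(f))))
```

The result is $f$ itself, so $xy - yx$ acts as the identity on this polynomial, in line with the product rule.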


Exercises 11.1 


1. Show that a module ${}_RM$ is simple if and only if $M \cong R/L$ for some maximal left
ideal $L$.

2. Show that the following are equivalent for a ring $R$: (1) $R$ is a division ring; (2)
every principal module $Rx \neq 0$ is simple; (3) ${}_RR$ is simple.

3. If $aR = bR$ where $a, b \in R$, show that there is an $R$-isomorphism $\sigma : Ra \to Rb$ such
that $\sigma(a) = b$.

4. Given ${}_RM$ and $m \in M$ define $\lambda_m : R \to M$ by $\lambda_m(a) = am$ for all $a \in R$.

(a) Show that $\lambda_m$ is $R$-linear for each $m \in M$.
(b) Show that $m \mapsto \lambda_m$ is an abelian group isomorphism $M \to \hom({}_RR, M)$.

5. If $R$ is left noetherian (left artinian), show that the same is true of the corner ring
$eRe$ for any idempotent $e = e^2$ in $R$. [Hint: If $L \subseteq eRe$ is a left ideal, consider $RL$.]

6. Show that the following are equivalent for a ring $R$: (1) $R$ is left noetherian (left
artinian); (2) $M_n(R)$ is left noetherian (left artinian). [Hint: $M_n(R)$ is a free left
$R$-module.]

7. If R is left noetherian, show that every finitely generated left R-module is left 
noetherian. [Hint: Lemma 2 and Theorem 5 §7.1.] 

8. Let $R$ be left artinian. If $X$ is a subset of a left module ${}_RM$, define the annihilator
$\operatorname{ann}(X) = \{a \in R \mid ax = 0 \text{ for all } x \in X\}$.

(a) Show that $\operatorname{ann}(M) = \operatorname{ann}(X)$ for some finite subset $X \subseteq M$.
(b) If $\operatorname{ann}(M) = 0$, show that ${}_RR$ is isomorphic to a submodule of $M$.

9. Complete the solution of Example 2 as follows: If $p \in \mathbb{Z}$ is a prime and we write
$X = \{\frac{m}{p^k} \in \mathbb{Q} \mid m \in \mathbb{Z},\ k \geq 0\}$, show that the only subgroups of $X$ that contain $\mathbb{Z}$ are
$\mathbb{Z}\frac{1}{p^k}$ for $k \geq 0$. [Hint: If $\mathbb{Z} \subset Y \subset X$, $Y$ a subgroup of $X$, choose $\frac{m}{p^n}$ in $Y$ where $m$
and $p^n$ are relatively prime and $n$ is maximal.]

10. Let $K \subseteq M$ be modules. If $K \subseteq N \subseteq M$ where $N$ is a submodule, show that $N/K$ is
a submodule of $M/K$, and that every submodule $Y$ of $M/K$ has the form $Y = N/K$
for some submodule $N \supseteq K$. [Hint: Theorem 5 §8.1.]
11. Let ${}_RM$ be a module and let $\pi^2 = \pi \in \operatorname{end} M$.
(a) Show that $M = \pi(M) \oplus \ker\pi$.
(b) If $M = N \oplus K$, $N$ and $K$ submodules, show that $\pi^2 = \pi \in \operatorname{end} M$ exists such
that $N = \pi(M)$ and $K = \ker\pi$.
(c) Show that $\pi(M) = \ker(1 - \pi)$ and $(1 - \pi)(M) = \ker\pi$.
12. Show that a module M is noetherian if and only if every submodule is finitely 
generated. 
13. Call a module $M$ finite dimensional if it contains no infinite direct sum of nonzero
submodules, and call $M$ indecomposable if it is not a direct sum $M = A \oplus B$ where
$A$ and $B$ are both nonzero submodules.
(a) Show that $M$ is finite dimensional if it is either noetherian or artinian.
(b) If $M$ is finite dimensional show that $M = N_1 \oplus N_2 \oplus \cdots \oplus N_k$ where each $N_i$
is indecomposable.
14. Suppose $R$ is a ring for which ${}_RR$ is finite dimensional (preceding exercise). If $ab = 1$
in $R$ show that $ba = 1$. [Hint: Write $ba = e$ and show that $e^2 = e$ and ${}_RR \cong Ra = Re$.]




15. Show that the converse of (2) of Schur’s lemma is not true. That is, find a module
$K$ such that $\operatorname{end} K$ is a division ring but $K$ is not simple. [Hint: If $F$ is a field,
consider $R = \begin{bmatrix} F & F \\ 0 & F \end{bmatrix}$, let $e = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$, and consider $Re$.]


16. Extend Lemma 4 as follows: If $e^2 = e$ and $f^2 = f$ are idempotents in a ring $R$, then
$eRf \cong \hom(Re, Rf)$ as additive abelian groups. [Hint: If $a \in eRf$ and $x \in Re$ then
$xa \in Rf$.]

17. Let $M$ be a module and let $\alpha \in \operatorname{end}(M)$.

(a) If $M$ is noetherian and $\alpha$ is onto, show that $\alpha$ is one-to-one.
(b) If $M$ is artinian and $\alpha$ is one-to-one, show that $\alpha$ is onto.
[Hint: We have chains $\ker(\alpha) \subseteq \ker(\alpha^2) \subseteq \cdots$ and $\alpha(M) \supseteq \alpha^2(M) \supseteq \cdots$.]

18. Fitting’s Lemma. Suppose that the module $M$ is both artinian and noetherian.
If $\alpha \in \operatorname{end}(M)$ show that there exists $n \geq 1$ such that $M = \alpha^n(M) \oplus \ker(\alpha^n)$. [Hint:
$\ker(\alpha) \subseteq \ker(\alpha^2) \subseteq \cdots$ and $\alpha(M) \supseteq \alpha^2(M) \supseteq \cdots$.]

19. Let $R$ be an $n$-dimensional algebra over a field $F$, and fix a basis $\{u_1, u_2, \dots, u_n\}$ of
$R$. Define $\theta : R \to M_n(F)$ by $\theta(r) = [r_{ij}]$, where $u_ir = \sum_{j=1}^{n} r_{ij}u_j$.

(a) Show that $\theta$ is a one-to-one ring homomorphism.

The image $\theta(R)$ is called the regular representation of $R$; in each case find it.
(b) $R = \mathbb{C}$, $F = \mathbb{R}$, basis $\{1, i\}$.

(c) $R = \mathbb{H}$, $F = \mathbb{R}$, basis $\{1, i, j, k\}$.

(d) Basis $\{1, u\}$ where $1^2 = 1$, $1u = u = u1$ and $u^2 = 0$.

(e) Basis $\{1, u\}$ where $1^2 = 1$, $1u = u = u1$ and $u^2 = 1$.

11.2. THE WEDDERBURN-ARTIN THEOREM 


The proof of Wedderburn’s theorem (Theorem 3 §11.1) shows that if $R$ is a simple
ring containing a simple left ideal $L$, then $R = L_1 \oplus L_2 \oplus \cdots \oplus L_n$ where the $L_i$
are isomorphic, simple left ideals (in fact $L_i \cong L$ for each $i$). Wedderburn actually
treated the case where the L; are not necessarily all isomorphic, albeit in the special 
case when R is a finite dimensional algebra. We deal with this general case in this 
section, in the context of rings. 

The work necessitates a look at modules that are the sum of a (possibly infinite) 
family of simple submodules. This investigation provides an extension of the well- 
known theory of vector spaces over any division ring. However, not surprisingly, 
transfinite methods are required. 

The technique we need is called Zorn’s lemma. Let $\mathcal{S}$ be a nonempty family of
subsets of some set. A chain from $\mathcal{S}$ is a (possibly infinite) set $\{X_i \mid i \in I\} \subseteq \mathcal{S}$
where either $X_i \subseteq X_j$ or $X_j \subseteq X_i$ for any $i, j \in I$. We say that $\mathcal{S}$ is inductive if the
union of any chain from $\mathcal{S}$ is again in $\mathcal{S}$. Zorn’s lemma asserts that every inductive
family $\mathcal{S}$ has a maximal member, that is a set $Y \in \mathcal{S}$ such that $Y \subseteq Z \in \mathcal{S}$ implies
$Y = Z$. A more general version of Zorn’s lemma is discussed in Appendix C.


Semisimple Modules 


Let $R$ be a ring, and let $\{M_i \mid i \in I\}$ be a (possibly infinite) family of submodules
of a module $M$. The sum $\sum_{i \in I} M_i$ of these submodules is defined to be the set of
all finite sums of elements of the $M_i$; more formally

\[ \sum_{i \in I} M_i = \{x_1 + \cdots + x_m \mid m \geq 1,\ \text{each } x_j \in M_i \text{ for some } i \in I\}. \]


This is a submodule of M that contains every M;. The following lemma is the 
extension of Theorem 3 §7.1 to the case of infinite sets of submodules. 


Lemma 1. The following are equivalent for submodules $\{M_i \mid i \in I\}$ of a module:
(1) Every element of $\sum_{i \in I} M_i$ is uniquely represented as a sum of elements of
$M_i$ for distinct $i$.
(2) The only way a sum of elements from distinct $M_i$ can equal 0 is if each of
the elements is zero.


(3) $\sum_{i \in J} M_i$ is direct for every finite subset $J \subseteq I$.


Proof. (1) ⇒ (2). If $x_1 + \cdots + x_m = 0 = 0 + \cdots + 0$ then each $x_i = 0$ by (1).

(2) ⇒ (3). If $N = M_{i_1} + \cdots + M_{i_m}$, where the $i_j$ are distinct, then $N$ is direct
by (2) and Theorem 3 §7.1.

(3) ⇒ (1). If $x \in \sum_{i \in I} M_i$ let $x = x_1 + \cdots + x_m = y_1 + \cdots + y_k$ where each $x_j$ and
each $y_j$ lies in some $M_i$. Inserting zeros where necessary, we may assume that $k = m$,
that $x_j$ and $y_j$ are in $M_{i_j}$ for each $j$, and that these $M_{i_j}$ are distinct. Hence,
$x_1 + \cdots + x_m = y_1 + \cdots + y_m$ in $M_{i_1} \oplus \cdots \oplus M_{i_m}$ for distinct $i_j$ by (3). Thus, $x_j = y_j$
for each $j$ by Theorem 3 §7.1, proving (1). ∎


In this case we say that $\sum_{i \in I} M_i$ is a direct sum, and we write it as $\bigoplus_{i \in I} M_i$.
One reason for introducing these infinite direct sums here is that they provide the
language needed to generalize some of the results about vector spaces.

If $D$ is a division ring and ${}_DM$ is a module, let $B = \{b_i \mid i \in I\}$ be a (possibly
infinite) set of nonzero elements of $M$. Then $B$ is said to span $M$ if $M = \sum_{i \in I} Db_i$.
The set $B$ is called independent if $\sum r_ib_i = 0$, $r_i \in D$, implies that $r_i = 0$ for
each $i$, equivalently if the sum $\sum_{i \in I} Db_i$ is direct (since $D$ is a division ring). A set $B$ that is
both independent and spans $M$ is called a basis of $M$. Furthermore, each principal
module $Db_i$ is simple (again because $D$ is a division ring). In this form, these facts
can be stated much more generally.

It is shown in Appendix C that every module $M$ over a division ring $D$ has
a basis. By the above discussion this means that $M$ is a direct sum of simple
submodules. In general, a module ${}_RM$ over a ring $R$ is called semisimple if it is
the direct sum of a (possibly infinite) family of simple submodules. The following
example shows that ${}_RR$ is semisimple if $R$ is the $2 \times 2$ matrix ring over a division
ring.

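Example 1. Let $D$ be a division ring, and let $R = M_2(D) = \begin{bmatrix} D & D \\ D & D \end{bmatrix}$. If
$L_1 = \begin{bmatrix} D & 0 \\ D & 0 \end{bmatrix}$ and $L_2 = \begin{bmatrix} 0 & D \\ 0 & D \end{bmatrix}$, show that $L_1$ and $L_2$ are each simple left ideals
of $R$, that $R = L_1 \oplus L_2$, and that $L_1 \cong L_2$ as $R$-modules.


Solution. $L_1$ and $L_2$ are left ideals by the definition of matrix multiplication, and
it is clear that $R = L_1 + L_2$ and $L_1 \cap L_2 = 0$. Hence $R = L_1 \oplus L_2$. We show that
$L_1$ is simple; the proof for $L_2$ is similar. So let $0 \neq x = \begin{bmatrix} a & 0 \\ b & 0 \end{bmatrix} \in L_1$, say $a \neq 0$.
Given $\begin{bmatrix} r & 0 \\ s & 0 \end{bmatrix} \in L_1$, we have $\begin{bmatrix} r & 0 \\ s & 0 \end{bmatrix} = \begin{bmatrix} ra^{-1} & 0 \\ sa^{-1} & 0 \end{bmatrix}\begin{bmatrix} a & 0 \\ b & 0 \end{bmatrix} \in Rx$, so $L_1 = Rx$.
A similar argument shows that $L_1 = Rx$ when $b \neq 0$.

Finally, if $a = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $b = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$ then it is a routine matter to show that
the maps $L_1 \to L_2$ given by $x \mapsto xa$ and $L_2 \to L_1$ given by $x \mapsto xb$ are mutually
inverse $R$-isomorphisms. Hence $L_1 \cong L_2$. □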
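
The claims of Example 1 can be confirmed by exhaustive search in a small case. This is a sketch assuming $D = \mathbb{Z}/3$, a field chosen only so that $R = M_2(D)$ has finitely many (81) elements; the helper names are hypothetical.

```python
from itertools import product

P = 3  # assumption: D = Z/3, so that R = M_2(D) is finite

def mul(a, b):
    """Multiply 2x2 matrices modulo P."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) % P for j in range(2))
        for i in range(2)
    )

def add(a, b):
    """Add 2x2 matrices modulo P."""
    return tuple(
        tuple((a[i][j] + b[i][j]) % P for j in range(2)) for i in range(2)
    )

R = [((w, x), (y, z)) for w, x, y, z in product(range(P), repeat=4)]
zero = ((0, 0), (0, 0))
L1 = {m for m in R if m[0][1] == 0 and m[1][1] == 0}  # second column zero
L2 = {m for m in R if m[0][0] == 0 and m[1][0] == 0}  # first column zero
a = ((0, 1), (0, 0))
b = ((0, 0), (1, 0))

# L1 and L2 are closed under left multiplication by R.
are_left_ideals = all(mul(r, x) in L1 for r in R for x in L1) and all(
    mul(r, y) in L2 for r in R for y in L2
)
# R = L1 + L2 and L1 meets L2 only in 0.
is_direct = ({add(x, y) for x in L1 for y in L2} == set(R)) and (
    (L1 & L2) == {zero}
)
# Every nonzero element of L1 generates L1, so L1 is simple.
L1_is_simple = all({mul(r, x) for r in R} == L1 for x in L1 if x != zero)
# x -> xa and y -> yb are mutually inverse maps between L1 and L2.
mutually_inverse = all(
    mul(x, a) in L2 and mul(mul(x, a), b) == x for x in L1
) and all(mul(y, b) in L1 and mul(mul(y, b), a) == y for y in L2)
```

The maps $x \mapsto xa$ and $x \mapsto xb$ are right multiplications, so they commute with left multiplication by $R$ by associativity; the code only checks that they are mutually inverse.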


We are going to give several characterizations of semisimple modules, and the
following notion is essential. A module $M$ is said to be complemented if every
submodule $K$ is a direct summand, that is if $M = K \oplus N$ for some submodule $N$.
Hence every simple module is complemented, as is every finite dimensional vector
space (Theorem 8(3) §6.1). A submodule $N$ of a module $M$ is called proper if
$N \neq M$, and a maximal proper submodule is called simply a maximal submodule.


Lemma 2. Let $M$ be a complemented module. Then
(1) Every submodule $N$ of $M$ is complemented.
(2) Every proper submodule of $M$ is contained in a maximal submodule.


Proof. (1) If $K \subseteq N$ let $K \oplus K_1 = M$. Then $N = N \cap (K \oplus K_1) = K \oplus (N \cap K_1)$
by the modular law (Theorem 2 §7.1).

(2) If $x \notin N$, write $\mathcal{S} = \{P \mid P \text{ is a submodule},\ N \subseteq P \text{ and } x \notin P\}$. Then $\mathcal{S}$ is
nonempty and inductive (verify) so, by Zorn’s lemma, choose a submodule $P$ maxi-
mal in $\mathcal{S}$. Since $M$ is complemented, let $M = P \oplus Q$; we show that $P$ is maximal by
showing that $Q$ is simple. If not, let $A \subseteq Q$ where $A \neq 0, Q$. By (1) write $Q = A \oplus B$.
Then $P \oplus A$ and $P \oplus B$ both strictly contain $P$, so neither is in $\mathcal{S}$. Hence $x$ lies in
both by the maximality of $P$, say $x = p_1 + a = p_2 + b$ where $p_1, p_2 \in P$, $a \in A$ and
$b \in B$. Since $a \in Q$ and $b \in Q$, and since $P \oplus Q$ is direct, it follows that $a = b$. But
then $a = b \in A \cap B = 0$, so $x = p_1 \in P$, a contradiction. ∎


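Lemma 3. Let $M = \sum_{i \in I} M_i$ where each $M_i$ is simple. If $N$ is any submodule of $M$
there exists $J \subseteq I$ such that $M = N \oplus (\bigoplus_{i \in J} M_i)$. In particular, $M$ is complemented.


Proof. If $N = M$ take $J = \emptyset$. If $N \neq M$ then some $M_i \not\subseteq N$ so $N \cap M_i = 0$
(because $M_i$ is simple). Hence, $\mathcal{S}$ is nonempty where

\[ \mathcal{S} = \{J \subseteq I \mid N + \textstyle\sum_{i \in J} M_i \text{ is a direct sum}\}. \]

Let $\{J_k\}$ be a chain from $\mathcal{S}$, and write $J = \bigcup J_k$. To see that $J$ is in $\mathcal{S}$, let

\[ 0 = x + x_{i_1} + \cdots + x_{i_m} \in N + \textstyle\sum_{i \in J} M_i, \]

where $x \in N$ and each $x_{i_t} \in M_{i_t}$. Since $\{J_k\}$ is a chain, there exists some $J_k$
containing all these $i_t$ so (since $J_k \in \mathcal{S}$) it follows that $x = x_{i_t} = 0$ for each $t$. Hence
$N + \sum_{i \in J} M_i$ is direct, which shows that $J \in \mathcal{S}$.

Hence, by Zorn’s lemma, $\mathcal{S}$ has a maximal member $J_0$. If $L = N + \sum_{i \in J_0} M_i$,
it remains to show that $L = M$. Since $M = \sum M_i$, we show that $M_i \subseteq L$ for each
$i \in I$. This is clear for all $i \in J_0$. If $i_0 \in I - J_0$, then $J_0 \cup \{i_0\} \notin \mathcal{S}$ so $L + M_{i_0}$ is not
direct, that is, $L \cap M_{i_0} \neq 0$. Hence, $L \cap M_{i_0} = M_{i_0}$ as $M_{i_0}$ is simple, so $M_{i_0} \subseteq L$,
as required. ∎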


With this we can characterize the semisimple modules. 


Theorem 1. The following conditions are equivalent for a module $M \neq 0$:
(1) $M = \sum_{i \in I} M_i$, where each $M_i$ is a simple submodule.
(2) $M = \bigoplus_{i \in I} M_i$, where each $M_i$ is a simple submodule.
(3) $M$ is complemented.
(4) Every maximal submodule of $M$ is a direct summand, and every proper
submodule is contained in a maximal submodule.


Proof. (1)⇒(2) and (2)⇒(3). These are by Lemma 3, the first with $N = 0$.

(3)⇒(4). This is clear by Lemma 2.

(4)⇒(1). Let $S$ denote the sum of all simple submodules of $M$ (take $S = 0$ if
there are none). Suppose $S \neq M$. By (4) let $S \subseteq N$ where $N$ is maximal in $M$, and
write $M = N \oplus K$ for some submodule $K$, again by (4). Then $K$ is simple (because
$K \cong M/N$) so $K \subseteq S \subseteq N$, contrary to $K \cap N = 0$. So $S = M$ and (1) follows. ∎


A module $M \neq 0$ is called semisimple if it satisfies the conditions in Theorem 1.
We also regard 0 as a semisimple module. Hence every module over a division ring
is semisimple, and the ring in Example 1 is semisimple as a left module over itself.
Lemma 3 gives


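Corollary 1. If $M$ is semisimple so also is every submodule and image of $M$. More
precisely, if $N \subseteq M = \bigoplus_{i \in I} M_i$ where each $M_i$ is simple, then

\[ N \cong \bigoplus_{i \in J} M_i \quad\text{and}\quad M/N \cong \bigoplus_{i \in I - J} M_i \quad\text{for some } J \subseteq I. \]

Note that we need not have $N = \bigoplus_{i \in J} M_i$ in Corollary 1. As an example, let
$M = F^2 = F \oplus F$ where $F$ is a field. Then $M = M_1 \oplus M_2$ where $M_1 = F \oplus 0$
and $M_2 = 0 \oplus F$. But if $N = \{(a, a) \mid a \in F\}$ then $M_i \not\subseteq N$ for $i = 1, 2$.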
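
The diagonal submodule in this example is easy to examine by machine. The sketch below assumes $F = \mathbb{Z}/5$ (any finite field would do); the helper `direct_sum` is a hypothetical name for the test that two subspaces are complementary.

```python
P = 5  # assumption: F = Z/5

M = {(a, b) for a in range(P) for b in range(P)}
M1 = {(a, 0) for a in range(P)}
M2 = {(0, b) for b in range(P)}
N = {(a, a) for a in range(P)}  # the diagonal

def add(u, v):
    """Componentwise addition in F^2."""
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def direct_sum(A, B):
    """True if A + B = M and A meets B only in 0."""
    return ({add(x, y) for x in A for y in B} == M) and ((A & B) == {(0, 0)})

N_contains_M1 = M1 <= N
N_contains_M2 = M2 <= N
```

So $N$ is complemented by either $M_1$ or $M_2$, and is isomorphic to each of them, yet it contains neither.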


Corollary 2. If M is finitely generated then M is semisimple if and only if every 
maximal submodule is a direct summand. 


Proof. By Theorem 1, it remains to show that every proper submodule $K \subset M$ is
contained in a maximal submodule of $M$. To this end, let
$\mathcal{S} = \{X \mid X \text{ is a submodule and } K \subseteq X \neq M\}$;
we show that $\mathcal{S}$ contains maximal members (they are then maximal in $M$). So let
$\{X_i \mid i \in I\}$ be a chain from $\mathcal{S}$ and write $X = \bigcup_{i \in I} X_i$. By Zorn’s lemma it suffices
to show that $X \neq M$. But if $X = M$, let $M = Rm_1 + \cdots + Rm_n$ (as $M$ is finitely
generated). Then each $m_t \in X_{i_t}$ for some $i_t \in I$. But the $X_i$ are a chain so there
exists $k \in I$ such that $X_{i_t} \subseteq X_k$ for each $t$. Hence each $m_t \in X_k$, so $M \subseteq X_k$, a
contradiction. Hence $X \neq M$ after all.


Lemma 4. The following are equivalent for a semisimple module $M$:

(1) $M$ is finitely generated.

(2) $M = M_1 \oplus M_2 \oplus \cdots \oplus M_n$ where each $M_i$ is simple.

(3) $M$ is artinian.

(4) $M$ is noetherian.
Finally the decomposition in (2) is unique: If $M = N_1 \oplus N_2 \oplus \cdots \oplus N_m$, where each
$N_i$ is simple then $m = n$ and (after relabeling) $N_i \cong M_i$ for each $i$.
Proof. (1)⇒(2). If $M = \bigoplus_{i \in I} M_i$ where the $M_i$ are simple, the generators of $M$ all lie
in a finite sum $M_{i_1} \oplus M_{i_2} \oplus \cdots \oplus M_{i_n}$ by (1).

(2)⇒(3) and (2)⇒(4). By (2) $M \supset M_2 \oplus \cdots \oplus M_n \supset \cdots \supset M_n \supset 0$ is a
composition series for $M$ so (3) and (4) follow from Theorem 1 §11.1.

(3)⇒(1) and (4)⇒(1). If $M = \bigoplus_{i \in I} M_i$ where $I$ is infinite, we may assume that
$\{1, 2, 3, \dots\} \subseteq I$. Then $(M_1 \oplus M_2 \oplus M_3 \oplus \cdots) \supset (M_2 \oplus M_3 \oplus \cdots) \supset \cdots$ contradicts
(3) and $M_1 \subset M_1 \oplus M_2 \subset \cdots$ contradicts (4).

Finally, as we saw above, (2) gives rise to a composition series for $M$ with factors
$M_1, M_2, \dots, M_n$. Hence, the last statement in the lemma follows from the Jordan-
Hölder theorem (Theorem 1 §11.1). ∎


Note that Lemma 4 proves the uniqueness of the number of elements in a finite 
basis of a module over any division ring (so we can speak of dimension). In fact, 
a version of the uniqueness actually goes through for arbitrary infinite direct sum 
decompositions of a semisimple module, a result beyond the scope of this book. 


Homogeneous Components 
Before proceeding we need a technical lemma. 


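Lemma 5. Let $M = \bigoplus_{i \in I} M_i$ with each $M_i$ simple and let $K \subseteq M$ be a simple
submodule. Then there exist $i_1, i_2, \dots, i_m$ in $I$ such that $K \subseteq M_{i_1} \oplus M_{i_2} \oplus \cdots \oplus M_{i_m}$
and $K \cong M_{i_t}$ for each $t = 1, 2, \dots, m$.


Proof. Choose $0 \neq y \in K$, and write $y = y_{i_1} + y_{i_2} + \cdots + y_{i_m}$, where $0 \neq y_{i_t} \in M_{i_t}$
for each $t$. Then $K \cap (M_{i_1} \oplus M_{i_2} \oplus \cdots \oplus M_{i_m}) \neq 0$ so, because $K$ is a simple module,
$K \subseteq M_{i_1} \oplus M_{i_2} \oplus \cdots \oplus M_{i_m}$. Because this is a direct sum, we can define $\alpha_t : K \to M_{i_t}$
as follows: If $x \in K$ take $\alpha_t(x) = x_{i_t}$, where $x = x_{i_1} + x_{i_2} + \cdots + x_{i_m}$ and $x_{i_j} \in M_{i_j}$
for each $j$. Then $\alpha_t$ is $R$-linear for each $t$, and $\alpha_t \neq 0$ because $\alpha_t(y) = y_{i_t} \neq 0$. So
$\alpha_t$ is an isomorphism by Schur’s lemma. ∎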


If $K \subseteq M$ are modules with $K$ simple, define the homogeneous component
$H(K)$ of $M$ generated by $K$ as follows:

\[ H(K) = \sum\{X \subseteq M \mid X \text{ is a submodule and } X \cong K\}. \]

We say that $H(K) = 0$ if $M$ contains no copy of $K$. Hence, if $K \subseteq M$ is simple
then $H(K)$ is a submodule of $M$ containing every submodule isomorphic to $K$ (and
hence $K$ itself). In fact every simple submodule of $H(K)$ is isomorphic to $K$.


Lemma 6. Let $K \subseteq M$ be modules with $K$ simple. The following are equivalent
for a simple submodule $L \subseteq M$:

(1) $L \cong K$.

(2) $H(L) = H(K)$.

(3) $L \subseteq H(K)$.


Proof. (1)⇒(2). This is because $X \cong L$ if and only if $X \cong K$ by (1).

(2)⇒(3). Here $L \subseteq H(L) = H(K)$ by (2).

(3)⇒(1). If $L \subseteq H(K) = \sum\{X \subseteq M \mid X \cong K\}$, then $L \cong X$ for some $X \cong K$ by
Lemma 5. This proves (1). ∎


A submodule $N$ of $M$ is said to be fully invariant in $M$ if $\alpha(N) \subseteq N$ for
every endomorphism $\alpha : M \to M$. Clearly 0 and $M$ are fully invariant for every
module $M$. Recall that the endomorphisms of ${}_RR$ are just the right multiplications
by elements of $R$ (Corollary to Lemma 4 §11.1). It follows that a left ideal $L$ of $R$
is fully invariant in ${}_RR$ if and only if $L$ is an ideal of $R$.




We can now prove the main structure theorem for semisimple modules. For
convenience, we say that submodules $K$ and $N$ meet if $K \cap N \neq 0$.


Theorem 2. Let M be a semisimple module. Then the following hold: 
(1) M is the direct sum of its homogeneous components. 
(2) Each homogeneous component of M is fully invariant in M. 


(3) Each fully invariant submodule of M is the direct sum of the homogeneous 
components it meets. 


Proof. Let $\{H(K_i) \mid i \in I\}$ be the distinct homogeneous components of $M$.

(1) $M = \sum_{i \in I} H(K_i)$ because $M$ is semisimple; to see that this sum is direct
we show that $H(K_t) \cap [\sum_{i \neq t} H(K_i)] = 0$ for each $t \in I$, and invoke Lemma 1. But if
$H(K_t) \cap [\sum_{i \neq t} H(K_i)] \neq 0$ then it contains a simple submodule $L$ (being semisimple).
Then $L \cong K_t$ by Lemma 5 because $L \subseteq H(K_t)$; and $L \cong K_i$ for some $i \neq t$ by
Lemma 5 because $L \subseteq \sum_{i \neq t} H(K_i)$. Hence, $K_t \cong L \cong K_i$ and so $H(K_t) = H(K_i)$ by
Lemma 6, contrary to the choice of the $H(K_i)$.

(2) Consider $H(K)$, where $K \subseteq M$ is simple. If $\alpha \in \operatorname{end} M$ we must show that
$\alpha[H(K)] \subseteq H(K)$, that is $\alpha(L) \subseteq H(K)$ whenever $L \subseteq M$ and $L \cong K$. But since
$L$ is simple, either $\alpha(L) = 0$ or $\alpha(L) \cong L \cong K$ by Schur’s lemma. Either way,
$\alpha(L) \subseteq H(K)$.

(3) Let the $K_i$ be as above, and let $N \subseteq M$ be fully invariant. We show that
$N = \sum\{H(K_i) \mid H(K_i) \cap N \neq 0\}$. Every simple submodule of $M$ is in some $H(K_i)$ by
Lemma 5, so $N \subseteq \sum\{H(K_i) \mid H(K_i) \cap N \neq 0\}$ because $N$ is semisimple. Conversely,
if $H(K_i) \cap N \neq 0$ then there exists $U \subseteq H(K_i) \cap N$ such that $U \cong K_i$. If $L \cong K_i$
is arbitrary then $L \cong U$ by Lemma 5, so let $\sigma : U \to L$ be an isomorphism. Since
$M$ is complemented (Theorem 1), write $M = U \oplus U_1$, and define $\alpha : M \to M$ by
$\alpha(u + u_1) = \sigma(u)$. Then $L = \sigma(U) = \alpha(U) \subseteq \alpha(N) \subseteq N$, where $\alpha(N) \subseteq N$ because
$N$ is fully invariant. Since $L \cong K_i$ was arbitrary, it follows that $H(K_i) \subseteq N$. ∎


A semisimple module $M$ is called homogeneous if it has only one homogeneous
component, that is (by Lemma 6) if all simple submodules of $M$ are isomorphic. In
particular if $D$ is a division ring and $R = M_2(D)$, then Example 1 shows that ${}_RR$
is a homogeneous, semisimple module. In fact much more is true as we shall see
below.


Free and Projective Modules 


The Wedderburn-Artin theorem shows that rings $R$ such that ${}_RR$ is semisimple as a
left $R$-module are isomorphic to finite direct products of matrix rings over division
rings. Other equivalent conditions on $R$ are also considered.

Finitely generated free and projective modules were discussed in Section 7.1.
However, the Wedderburn-Artin theorem involves arbitrary projective modules, so
we pause to review these notions. Let ${}_RW$ be a module, and let $B$ be a set of
nonzero elements of $W$. We make three definitions:

$B$ is said to generate $W$ if $W = \sum_{w \in B} Rw$.

$B$ is called independent if $\sum r_iw_i = 0$, $r_i \in R$, $w_i \in B$, implies each $r_i = 0$.

$B$ is called a basis of $W$ if it is independent and generates $W$.

A module ${}_RW$ that has a basis is called a free module. Note that the second
condition above implies that $Rw_i \cong {}_RR$ for each $w_i \in B$ (see the corollary to Theorem
1 §7.1). Hence, every free module is isomorphic to a direct sum of copies of $R$. The
finitely generated free modules in Section 7.1 are examples, but free modules with
bases of arbitrary size can easily be constructed.

If $I$ is any nonempty set and $R$ is any ring, there exists a free $R$-module with
a basis indexed by $I$. If $i \mapsto r_i$ is any function $I \to R$, write it as $(r_i)$. Hence
$(r_i) = (s_i)$ if and only if $r_i = s_i$ for each $i \in I$, and we call $r_i$ the $i$th component
of $(r_i)$. We call $(r_i)$ an $I$-sequence from $R$. Define

\[ R^{(I)} = \{(r_i) \mid r_i = 0 \text{ for all but finitely many } i \in I\}. \]

If $I$ has $n$ elements it is clear that $R^{(I)} \cong R^n$. In general, $R^{(I)}$ becomes a left
$R$-module with componentwise operations:

\[ (r_i) + (s_i) = (r_i + s_i) \quad\text{and}\quad r(r_i) = (rr_i), \quad\text{for all } r \in R. \]

If $e_i$ denotes the $I$-sequence with $i$th component 1 and all other components 0, then
it is a routine matter to check that $\{e_i \mid i \in I\}$ is a basis of $R^{(I)}$, called the standard
basis. Hence, $R^{(I)} = \bigoplus_{i \in I} Re_i$, where $Re_i \cong {}_RR$ for each $i$.
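
An $I$-sequence with only finitely many nonzero components can be stored by listing just those components. The following is a minimal sketch, assuming $R = \mathbb{Z}$ and an arbitrary index set whose elements are usable as dictionary keys; the function names are hypothetical.

```python
# Assumption: R = Z. An element of R^(I) is a dict {index: nonzero coefficient};
# indices absent from the dict have component 0.

def add(u, v):
    """Componentwise sum of two finitely supported I-sequences."""
    out = dict(u)
    for i, s in v.items():
        out[i] = out.get(i, 0) + s
    return {i: r for i, r in out.items() if r != 0}

def scale(r, u):
    """Scalar multiple r * u, computed componentwise."""
    return {i: r * s for i, s in u.items() if r * s != 0}

def e(i):
    """Standard basis element: component 1 at index i, 0 elsewhere."""
    return {i: 1}

def from_basis(u):
    """Rebuild u as the finite sum of r_i * e_i over its support."""
    total = {}
    for i, r in u.items():
        total = add(total, scale(r, e(i)))
    return total

u = {"a": 2, "zeta": -7}
v = {"a": -2, 3: 4}
w = add(u, v)
```

The index set is never enumerated, which mirrors the fact that $R^{(I)}$ makes sense for index sets of any size.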

Now let $\{x_i \mid i \in I\}$ be a generating set for a module ${}_RM$, that is, $M = \sum_{i \in I} Rx_i$.
Let $W$ be a free module with a basis $B = \{w_i \mid i \in I\}$ indexed by $I$ (for example
$W = R^{(I)}$). Given $r_1, r_2, \dots, r_n$ in $R$, define

\[ \beta : W \to M \quad\text{by}\quad \beta(r_1w_1 + r_2w_2 + \cdots + r_nw_n) = r_1x_1 + r_2x_2 + \cdots + r_nx_n. \]

Because $B$ is a basis, this map is well defined, and it is evidently onto. Since every
module $M$ has a generating set (the set of all nonzero elements, for example), this
proves the first part of


Lemma 7. Let $M$ denote any left $R$-module.
(1) $M$ is an image of a free module.
(2) If $\alpha : M \to W$ is onto and $W$ is free, then $\ker\alpha$ is a direct summand of $M$.


Proof. We proved (1) above. As to (2), let $B = \{w_i \mid i \in I\}$ be a basis of $W$. As
$\alpha$ is onto, let $w_i = \alpha(x_i)$, $x_i \in M$ for each $i$. By the above discussion, there exists
$\beta : W \to M$ such that $\beta(w_i) = x_i$ for each $i$. Then $\alpha\beta(w_i) = \alpha(x_i) = w_i$ for each $i$,
so $\alpha\beta = 1_W$ because the $w_i$ generate $W$. But then $M = \ker\alpha \oplus \beta(W)$ by Lemma 8
below, proving (2). ∎


Lemma 8. If $\alpha : M \to P$ is onto, the following conditions are equivalent:
(1) There exists $\beta : P \to M$ such that $\alpha\beta = 1_P$.
(2) $\ker\alpha$ is a direct summand of $M$, in fact $M = \ker\alpha \oplus \beta(P)$.

In this case the map $\alpha$ is said to split.


Proof. (1) ⇒ (2). If $\alpha\beta = 1_P$ as in (1), let $m \in M$. As $\alpha(m) \in P$, we have
$\alpha(m) = 1_P[\alpha(m)] = \alpha\beta\alpha(m)$. Thus $m - \beta\alpha(m) \in \ker\alpha$, so $M = \ker\alpha + \beta(P)$.
But if $m \in \ker\alpha \cap \beta(P)$, let $m = \beta(p)$, $p \in P$. Then $0 = \alpha(m) = \alpha\beta(p) = p$, so
$m = \beta(p) = \beta(0) = 0$. This proves that $\ker\alpha \cap \beta(P) = 0$ and so proves (2).

(2) ⇒ (1). Given (2), let $M = \ker\alpha \oplus Q$. Observe that $P = \alpha(M) = \alpha(Q)$, so
we define $\beta : P \to M$ as follows: if $p \in P$ and $p = \alpha(q)$, $q \in Q$, define $\beta(p) = q$.
This is well defined because if $p = \alpha(q_1)$, $q_1 \in Q$, then $q - q_1 \in Q \cap \ker\alpha = 0$. Now
given $p = \alpha(q)$ in $P$ then $\alpha\beta(p) = \alpha(q) = p$, proving (1). □
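
The decomposition $m = (m - \beta\alpha(m)) + \beta\alpha(m)$ used in the proof of (1) ⇒ (2) can be traced on a small example. The sketch below assumes $R = \mathbb{Z}$, $M = \mathbb{Z}^2$, $P = \mathbb{Z}$, with $\alpha(x, y) = x + y$ and $\beta(p) = (p, 0)$; these particular maps are illustrative choices, not taken from the text.

```python
def alpha(m):
    """An onto map Z^2 -> Z (illustrative choice)."""
    return m[0] + m[1]

def beta(p):
    """A map Z -> Z^2 with alpha(beta(p)) = p, so alpha splits."""
    return (p, 0)

def decompose(m):
    """Write m = k + beta(p) with k in ker(alpha), as in the proof."""
    image_part = beta(alpha(m))
    kernel_part = (m[0] - image_part[0], m[1] - image_part[1])
    return kernel_part, image_part

k, b = decompose((7, -3))
```

Here the kernel of `alpha` consists of the pairs $(t, -t)$ and the image of `beta` of the pairs $(p, 0)$, and each element of $\mathbb{Z}^2$ is uniquely the sum of one of each kind.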


A module ${}_RP$ is called projective if it satisfies the condition:
If ${}_RM \xrightarrow{\alpha} P$ is $R$-linear and onto then $\ker\alpha$ is a direct summand of $M$.


Hence all free modules are projective by Lemma 7 (but see Example 3 below). Also,
$P$ is projective if and only if every onto $R$-morphism $M \to P$ splits. We need


Lemma 9. If $P$ is projective and $Q \cong P$, then $Q$ is projective.


Proof. Let $\alpha : M \to Q$ be onto. If $\sigma : Q \to P$ is an isomorphism then $\sigma\alpha : M \to P$
is onto so $\ker\alpha = \ker\sigma\alpha$ is a direct summand of $M$. Hence, $Q$ is projective. □


Theorem 3. The following conditions on a module ${}_RP$ are equivalent:
(1) $P$ is projective.
(2) $P$ is isomorphic to a direct summand of a free module.
(3) If $\alpha$ is onto in the diagram $P \xrightarrow{\beta} N \xleftarrow{\alpha} M$, then $\gamma : P \to M$ exists such that $\alpha\gamma = \beta$.
(4) If $M \xrightarrow{\alpha} P$ is onto there exists $P \xrightarrow{\beta} M$ such that $\alpha\beta = 1_P$.


Proof. (1)⇒(2). If $W \xrightarrow{\alpha} P$ is onto with $W$ free (Lemma 7), then $W = \ker\alpha \oplus Q$ for
some $Q$ by (1). Hence, $P = \alpha(W) \cong W/\ker\alpha \cong Q$, proving (2).

(2)⇒(3). Since being projective is preserved under isomorphism, we may assume
that $W = P \oplus Q$ is free. Let $\pi : W \to P$ be the projection defined by $\pi(p + q) = p$
for $p \in P$ and $q \in Q$, and let $\{w_i \mid i \in I\}$ be a basis of $W$. Given $\alpha$ as in the diagram,
we must construct $\gamma : P \to M$ such that $\alpha\gamma = \beta$.

Note that $\beta\pi(w_i) \in N$ for each $i \in I$. Since $\alpha : M \to N$ is onto, there exists
$m_i \in M$ such that $\beta\pi(w_i) = \alpha(m_i)$. But the $w_i$ are a basis of $W$, so there exists
$\lambda : W \to M$ such that $\lambda(w_i) = m_i$ for each $i$. Hence,

\[ \alpha\lambda(w_i) = \alpha(m_i) = \beta\pi(w_i), \quad\text{for each } i. \]

It follows that $\alpha\lambda = \beta\pi$ because the $w_i$ generate $W$. Finally, let $\gamma : P \to M$ be the
restriction of $\lambda$ to $P$, that is, $\gamma(p) = \lambda(p)$ for all $p \in P$. Compute

\[ \alpha\gamma(p) = \alpha\lambda(p) = \beta\pi(p) = \beta(p), \quad\text{for all } p \in P, \]

because $\pi(p) = p$. Hence, $\alpha\gamma = \beta$, as required.
(3)⇒(4). Take $N = P$ and $\beta = 1_P$ in the diagram.
(4)⇒(1). This is Lemma 8. □


Example 2. A module ${}_RP$ is a principal projective module if and only if $P \cong Re$
for some $e^2 = e \in R$.


Solution. If $P = Rx$ is projective, define $\alpha : R \to Rx$ by $\alpha(r) = rx$ for all $r \in R$.
Then $\alpha$ is onto so ${}_RR = \ker\alpha \oplus L$ for some left ideal $L$. Hence $P \cong R/\ker\alpha \cong L$.
But direct summands of ${}_RR$ have the form $Re$ for some idempotent $e$ by Example
6 §7.1, so we have $P \cong Re$ for some $e^2 = e \in R$.

Conversely, if $e^2 = e$ then $Re \oplus R(1 - e) = R$ is free, so $Re$ is projective by
Theorem 3. □


Example 3. There exist projective modules that are not free.
Solution. Let $F$ be a field, and let $R = F \times F$. Then $e = (1, 0)$ is an idempotent in
$R$, so $Re = \{(a, 0) \mid a \in F\}$ is projective. But $\dim_F(Re) = 1$ so $Re$ is not free since
any nonzero free $R$-module has $F$-dimension at least 2 (because $\dim_F R = 2$). □
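
For a finite field the dimension count in Example 3 becomes a count of elements. The sketch below assumes $F = \mathbb{Z}/2$, so that $R = F \times F$ has four elements and a free $R$-module on $k$ generators has $4^k$ elements; the helper names are hypothetical.

```python
from itertools import product

P = 2  # assumption: F = Z/2

R = list(product(range(P), repeat=2))  # R = F x F, componentwise operations

def mul(r, s):
    """Componentwise multiplication in F x F."""
    return (r[0] * s[0] % P, r[1] * s[1] % P)

e = (1, 0)
Re = {mul(r, e) for r in R}

# Sizes of the free modules R^k for k = 0, 1, 2, 3.
free_sizes = [len(R) ** k for k in range(4)]
Re_could_be_free = len(Re) in free_sizes
```

Since $Re$ has two elements and no free module over this ring has exactly two elements, $Re$ is projective (a direct summand of $R$) but not free.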




The Wedderburn—Artin Theorem 


Let $A$ be a left ideal of a ring $R$. If $X$ is a nonempty subset of a module ${}_RM$, define
$AX$ to be the set of all finite sums of elements $ax$ where $a \in A$ and $x \in X$. This is
a submodule of $M$, and it is easy to verify that


(A+B)X =AX+BX and A(BX) =(AB)X 


hold for all left ideals A and B. Note that RX = X if X is a submodule of M. 
Taking M = R in the above discussion shows that multiplication of left ideals is 
associative: A(BC) = (AB)C for any left ideals A, B, and C. 

An ideal $A$ is called nilpotent if $A^n = 0$ for some $n \geq 1$; equivalently if any
product of $n$ elements of $A$ is zero. The next result, proved in 1942 by Richard
Brauer, characterizes the non-nilpotent, simple left ideals.


Lemma 9. Brauer's Lemma. Let $K$ be a simple left ideal of a ring $R$. Then
either $K^2 = 0$ or $K = Re$ for some nonzero idempotent $e^2 = e \in K$.


Proof. Assume that $K^2 \neq 0$, so that $Ka \neq 0$ for some $0 \neq a \in K$. Since $Ka \subseteq K$ is
a left ideal, it follows that $Ka = K$ because $_RK$ is simple. In particular $ea = a$ for
some $0 \neq e \in K$. Hence $e^2a = ea$, so $e^2 - e \in B = \{b \in K \mid ba = 0\}$. Moreover, $B$ is
a left ideal and $B \neq K$ because $Ka \neq 0$, so $B = 0$, again by simplicity. This means
that $e^2 = e$, and so $Re \subseteq K$ because $e \in K$. But $e \neq 0$ so $Re = K$ by a third appeal
to the simplicity of $_RK$. This is what we wanted. □
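
As a small computational illustration (a sketch only, not part of the development), the dichotomy in Brauer's lemma can be checked by brute force in the finite commutative rings Z_n. The helper names `ideal`, `simple_ideals`, and `brauer_case` are hypothetical and introduced only for this example; the code relies on the fact that every ideal of Z_n is principal.

```python
def ideal(n, a):
    """The principal ideal Ra of R = Z_n, as a frozenset of residues."""
    return frozenset((r * a) % n for r in range(n))


def simple_ideals(n):
    """Nonzero ideals of Z_n that contain no smaller nonzero ideal."""
    # Every ideal of Z_n is principal, so this set lists all nonzero ideals.
    ideals = {ideal(n, a) for a in range(1, n)}
    return [K for K in ideals if not any(J < K for J in ideals)]


def brauer_case(n, K):
    """Classify a simple ideal K of Z_n as in Brauer's lemma."""
    # K^2 = 0 exactly when every product of two elements of K is zero.
    if all((x * y) % n == 0 for x in K for y in K):
        return ("square-zero", None)
    # Otherwise look for a nonzero idempotent e in K with Re = K.
    for e in K:
        if e != 0 and (e * e) % n == e and ideal(n, e) == K:
            return ("idempotent", e)
    raise AssertionError("no case of Brauer's lemma applies")


for K in simple_ideals(12):
    print(sorted(K), brauer_case(12, K))
```

In Z_12 both cases occur: the ideal generated by 6 has square zero, while the ideal generated by 4 equals Re for the idempotent e = 4.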


We now turn to another property of rings that plays a prominent role in the 
Wedderburn-Artin theorem. A ring R is called semiprime if it satisfies the follow- 
ing equivalent conditions (the routine verifications are left to the reader): 


(1) If $A^2 = 0$, $A$ a left (or right, or two-sided) ideal, then $A = 0$.

(2) If $A^n = 0$, $n \geq 1$, $A$ a left (or right, or two-sided) ideal, then $A = 0$.

(3) If $aRa = 0$ where $a \in R$, then $a = 0$.

Hence, every ring with no nonzero nilpotent elements is semiprime, and the converse
is true for commutative rings. A product $R = R_1 \times R_2 \times \cdots \times R_n$ of rings is semiprime
if and only if each $R_i$ is semiprime (using condition (3)). A matrix ring $M_n(R)$ is
semiprime if and only if $R$ is semiprime because the ideals of $M_n(R)$ all have the
form $M_n(A)$ for some ideal $A$ of $R$ by Lemma 3 §3.3.
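
For a concrete feel for condition (3), the following sketch tests it by exhaustive search in the commutative rings Z_n and compares the answer with the absence of nonzero nilpotent elements. The function names are hypothetical, chosen for this illustration only, and the range of n is an arbitrary choice.

```python
def is_semiprime_zn(n):
    """Condition (3) in R = Z_n: aRa = 0 forces a = 0."""
    for a in range(1, n):
        if all((a * r * a) % n == 0 for r in range(n)):
            return False  # a nonzero element a with aRa = 0
    return True


def has_nonzero_nilpotent(n):
    """True if some nonzero a in Z_n satisfies a^k = 0 for some k >= 1."""
    for a in range(1, n):
        power = a
        for _ in range(n):  # checking the powers a, a^2, ..., a^n is enough
            if power == 0:
                return True
            power = (power * a) % n
    return False


# For commutative rings the two properties should agree.
for n in range(2, 31):
    assert is_semiprime_zn(n) == (not has_nonzero_nilpotent(n))

print([n for n in range(2, 31) if is_semiprime_zn(n)])
```

The values of n that pass are exactly those with no repeated prime factor, as one expects from taking r = 1 in condition (3).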

The next theorem gives several useful characterizations of when a ring R is 
semisimple as a left module over itself. 


Theorem 4. If $R$ is a ring the following conditions are equivalent:
(1) $_RR$ is semisimple.
(2) Every left $R$-module is semisimple.
(3) Every left $R$-module is projective.
(4) $R$ is left artinian and semiprime.


Proof. (1) $\Rightarrow$ (2). Each module $_RM$ is an image of a free module, and free modules
are semisimple by (1). So (2) follows from Corollary 1 of Theorem 1.

(2) $\Rightarrow$ (3). Given a module $M$, let $\alpha : N \to M$ be an onto $R$-linear map. Then
$\ker\alpha$ is a summand of $N$ because $N$ is semisimple by (2). This proves (3).

(3) $\Rightarrow$ (1). Let $L$ be any left ideal of $R$, and consider the coset map $\varphi : R \to R/L$.
Then $L = \ker\varphi$, so $L$ is a direct summand of $_RR$ because $R/L$ is projective by (3).
Hence $_RR$ is complemented, and so is semisimple by Theorem 1.


466 11. Finiteness Conditions for Rings and Modules 


(1) $\Rightarrow$ (4). First, $R$ is left artinian by Lemma 4 because $_RR$ is finitely generated.
Suppose $A$ is an ideal of $R$ with $A^2 = 0$. By Theorem 1, $R = A \oplus L$, $L$ a left ideal,
so $AL \subseteq A \cap L = 0$. Hence, $A = AR = A^2 + AL = 0 + 0 = 0$, so $R$ is semiprime.

(4) $\Rightarrow$ (1). If $_RR$ is not semisimple, let $L$ be minimal among nonzero left ideals
of $R$ that are not semisimple (by (4)). Since $L \neq 0$, let $K \subseteq L$ be a simple left
ideal, again by (4). Then $K^2 \neq 0$ because $R$ is semiprime, so $K = Re$ for some
$0 \neq e^2 = e$ by Brauer's lemma. Since $R = K \oplus R(1 - e)$ and $K \subseteq L$, we obtain


$L = K \oplus (L \cap R(1 - e))$ by the modular law (Theorem 2 §7.1).


Hence, there are two cases: either $L \cap R(1 - e) = 0$ (in which case $L = K$ is simple)
or $L \cap R(1 - e)$ is semisimple (by the minimality of $L$). Either way, $L$ is semisimple,
a contradiction. So $_RR$ is semisimple. □


Note that we cannot replace “left artinian” by “left noetherian” in (4) of Theorem 4.
In fact the ring Z of integers is noetherian and semiprime, but it is not semisimple.

The first, and possibly the most important, application of the theory of semisimple
modules is to prove the following fundamental theorem: If $R$ is a ring then $_RR$
is semisimple if and only if $R$ is a finite direct product of matrix rings over division
rings. This result was first proved in 1908 by Wedderburn for finite dimensional
algebras. Then in 1927, Artin replaced the finite dimensional hypothesis by the
descending chain condition on left ideals.


Theorem 5. Wedderburn-Artin Theorem. The following conditions are equivalent
for a ring $R$:


(1) $_RR$ is semisimple.

(2) $R_R$ is semisimple.

(3) $R \cong M_{n_1}(D_1) \times M_{n_2}(D_2) \times \cdots \times M_{n_k}(D_k)$ for division rings $D_i$.

Moreover, the integers $k, n_1, \ldots, n_k$ in (3) are uniquely determined by $R$, as are
the division rings $D_i$ up to isomorphism.


Proof. We need prove only (1) $\Leftrightarrow$ (3) by the right-left symmetry of (3).
(1) $\Rightarrow$ (3). Let $H_1, H_2, \ldots, H_m$ be the homogeneous components of $_RR$. Hence


$R = H_1 \oplus H_2 \oplus \cdots \oplus H_m$


by Theorem 2. Moreover, each $H_i$ is an ideal of $R$ (being fully invariant), so
$R \cong H_1 \times H_2 \times \cdots \times H_m$ as rings by Theorem 7 §3.4 (and induction). So, by
Wedderburn's theorem, it remains to show that $H_i$ is left artinian and simple for
each $i$. Write $H = H_i$ for convenience. The ring $R$ is left artinian (by Lemma 4
since it is semisimple), so the same is true of $_RH$. But the left ideals of the ring $H$
are exactly the left ideals of $R$ that happen to be contained in $H$ (verify). Hence
$_HH$ is artinian. Finally, if $A \neq 0$ is an ideal of the ring $H$, then $A$ is an ideal of $R$
because $H = Re$ where $e^2 = e$ is central in $R$ (verify). Hence, $A$ is fully invariant in
$_RR$. But then Theorem 2 shows that $A = \Sigma\{H_j \mid A \cap H_j \neq 0\}$, and it follows that
$A = H$. Thus, the ring $H$ is left artinian and simple, and (3) follows by Wedderburn's
theorem.

(3) $\Rightarrow$ (1). If $D$ is a division ring, the ring $M_n(D) = K_1 \oplus K_2 \oplus \cdots \oplus K_n$ where
$K_j$ is the left ideal of all matrices with only column $j$ nonzero. It is a routine
verification (see Example 1) that each $K_j$ is a simple left ideal of $M_n(D)$, so $M_n(D)$
is semisimple. Hence, (1) follows from (3).
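
The column decomposition used in this step can be confirmed by exhaustive search in a very small case. The sketch below takes D to be the field Z_2 and n = 2 (both choices are assumptions made only to keep the search tiny), and the helper names are hypothetical.

```python
from itertools import product

P = 2  # work over the field Z_2, so M_2(Z_2) has only 16 elements


def mat_mul(A, B):
    """Product of two 2 x 2 matrices with entries reduced mod P."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % P
                       for j in range(2)) for i in range(2))


def mat_add(A, B):
    """Sum of two 2 x 2 matrices with entries reduced mod P."""
    return tuple(tuple((A[i][j] + B[i][j]) % P for j in range(2))
                 for i in range(2))


ALL = [((a, b), (c, d)) for a, b, c, d in product(range(P), repeat=4)]
ZERO = ((0, 0), (0, 0))


def column_ideal(j):
    """Matrices whose only possibly nonzero column is column j (0-based)."""
    return {A for A in ALL if all(A[i][1 - j] == 0 for i in range(2))}


def is_left_ideal(K):
    """Closed under addition and under left multiplication by any matrix."""
    return (all(mat_add(A, B) in K for A in K for B in K)
            and all(mat_mul(R, A) in K for R in ALL for A in K))


def is_simple(K):
    """Every nonzero element of K generates all of K as a left ideal."""
    return all({mat_mul(R, A) for R in ALL} == K for A in K if A != ZERO)


K1, K2 = column_ideal(0), column_ideal(1)
assert is_left_ideal(K1) and is_left_ideal(K2)
assert is_simple(K1) and is_simple(K2)
assert K1 & K2 == {ZERO}                                        # the sum is direct
assert {mat_add(A, B) for A in K1 for B in K2} == set(ALL)      # and is all of M_2(Z_2)
```

The same search works over any small finite field by changing `P` to another prime, at the cost of a larger enumeration.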




Uniqueness. Suppose $R \cong M_{m_1}(B_1) \times M_{m_2}(B_2) \times \cdots \times M_{m_l}(B_l)$ is another such
decomposition where each $B_j$ is a division ring. Lemma 10 below shows that $k = l$
and, after relabeling, that $M_{n_i}(D_i) \cong M_{m_i}(B_i)$ for each $i = 1, 2, \ldots, k$. We can now
apply the uniqueness in Wedderburn's theorem (Theorem 3 §11.1). □


The rings in Theorem 5 are called semisimple rings. The next result shows that 
this property is inherited by several related rings—the proofs are Exercises 10, 11, 
and 12. 


Corollary. If $R$ is a semisimple ring, so also is every matrix ring $M_n(R)$, every
factor ring $R/A$, and every corner ring $eRe$ where $e^2 = e$.


Note that subrings of semisimple rings need not be semisimple (consider Z C Q). 


Lemma 10. Let $R_1 \times R_2 \times \cdots \times R_k \cong S_1 \times S_2 \times \cdots \times S_l$ as rings, where each $R_i$
and each $S_j$ is a simple ring with a left composition series. Then $k = l$ and, after
relabeling, $R_i \cong S_i$ for each $i$.


Proof. Write $R = \prod_{i=1}^{k} R_i$ and $S = \prod_{j=1}^{l} S_j$. We may assume that $k \leq l$. Let
$\sigma : R \to S$ be a ring isomorphism. The ideals of $S$ are all of the form $\prod A_j$,
where $A_j$ is an ideal of $S_j$ for each $j$ (Exercise 4). Hence, since $\sigma(R_1)$ is a simple
ideal of $\prod S_j$, it must be one of the $S_j$; by relabeling assume that $\sigma(R_1) = S_1$.
Similarly $\sigma(R_2) = S_j$ for some $j$, and $j \neq 1$ because $\sigma(R_1) \neq \sigma(R_2)$, $\sigma$ is one-to-one,
and $R_1 \cap R_2 = 0$. So, after relabeling, let $\sigma(R_2) = S_2$. Continue to obtain
$S = \sigma(R_1) \times \cdots \times \sigma(R_k) \times S_{k+1} \times \cdots \times S_l$. Hence, $S \cong R \times (S_{k+1} \times \cdots \times S_l)$ so,
since $S$ and $R$ have the same left composition length, the Jordan-Hölder theorem
(Theorem 1 §11.1) shows that $S_{k+1} \times \cdots \times S_l$ has length zero. Hence, $k = l$ and
$S_i = \sigma(R_i) \cong R_i$ for each $i$. This completes the proof. □


Wedderburn's 1908 version of Theorem 5 was a breakthrough.135 To quote Emil
Artin, “This extraordinary result has excited the fantasy of every algebraist and
still does so to this day. Very great efforts have been directed toward a deeper
understanding of its meaning.”136 When the Wedderburn-Artin theorem appeared
in 1927 it was a landmark in algebra. It influenced a generation of ring theorists
and has inspired many generalizations.

Wedderburn’s theorem asserts that a left artinian simple ring is a matrix ring 
over a division ring. In 1945, Nathan Jacobson, and independently Claude Chevalley, 
extended Wedderburn’s theorem by dropping the artinian hypothesis and showing 
that a simple ring with a simple left ideal must be isomorphic to a “dense” subring 
of the ring of endomorphisms of a vector space over a division ring. This result is 
called the density theorem. 

Also in 1945, Jacobson showed that the intersection J of all the maximal left 
ideals of R always equals the intersection of the maximal right ideals of R. He then 
proved that J is the largest ideal with the property that $1 + a$ is a unit for all $a \in J$,
extending work of Sam Perlis done in 1942 for finite dimensional algebras over a 
field. The ideal J is called the Jacobson radical of the ring R. It is known that, 


135 Maclagan Wedderburn, J.H. On hypercomplex numbers, Proceedings of the London Mathematical
Society, Series 2, 6 (1908), 77-118.

136 Artin, E. The influence of J.H.M. Wedderburn on the development of modern algebra, Bulletin
of the American Mathematical Society 56 (1950), 65-72.




if R is left (or right) artinian, the factor ring R/J is semisimple, and idempotents
can be lifted modulo J in the sense that if $a^2 - a \in J$ then there exists $e^2 = e$
such that $a - e \in J$. In 1960 this led Hyman Bass to carry the theory further. He
called a ring R semiperfect if R/J is semisimple and idempotents can be lifted
modulo J, and he showed that many properties of left artinian rings carry over to
these semiperfect rings.137

Finally, the Wedderburn—Artin theorem shows that a left artinian semiprime 
ring is semisimple, and another natural question is what happens if we replace 
the left artinian condition by the requirement that the ring be left noetherian. In 
1960, Alfred Goldie proved a fundamental structure theorem for the semiprime left 
noetherian rings. There is a way of embedding certain rings into a ring of left 
quotients, a noncommutative version of the construction of the field of quotients 
of an integral domain. Goldie showed that every left noetherian, semiprime ring has 
a semisimple ring of left quotients. 


Exercises 11.2 


1. Describe the semisimple Z-modules, and the homogeneous ones.

2. Let $R$ be a ring and let $a \in R$.

(a) If $R$ is a semisimple ring and $L$ is a left ideal, show that $L = Re$ for some
$e^2 = e$ (so $L$ is principal). [Hint: R is complemented.]

(b) In general, if $Ra = Re$ with $e^2 = e$, show that $aR = fR$ for some $f^2 = f$. [Hint:
Show that $ata = a$ where $e = ta$, and use $f = at$.]

3. Let $M_R$ be a right $R$-module, and write $E = \mathrm{end}(M_R)$.

(a) Show that $M$ is a left $E$-module via $\alpha \cdot x = \alpha(x)$ for all $\alpha \in E$ and $x \in M$.

(b) Show that $_EM$ is simple if and only if the only fully invariant submodules of
$M_R$ are 0 and $M$.

4. If $A$ is an ideal of a product $R = R_1 \times R_2 \times \cdots \times R_n$ of rings, show that $A$ has the
form $A = A_1 \times A_2 \times \cdots \times A_n$, where $A_i$ is an ideal of $R_i$ for each $i$.

5. If $N_1, \ldots, N_n$ are all maximal submodules of a module $M$, show that $M/(\cap N_i)$ is
semisimple.

6. Let $_RM = H_1 \oplus H_2 \oplus \cdots \oplus H_n$ where each $H_i$ is fully invariant in $M$. Show that
$\mathrm{end}\,M \cong \mathrm{end}(H_1) \times \mathrm{end}(H_2) \times \cdots \times \mathrm{end}(H_n)$ as rings.

7. If $M$ is a finitely generated, semisimple module, show that end $M$ is a semisimple
ring. [Hint: Preceding exercise and Lemma 8 §7.1.]

8. Let $K$ be a simple module. If $M$ is any module define $H_M(K)$ to be the sum of
all submodules of $M$ isomorphic to $K$, where $H_M(K) = 0$ if $M$ has no submodule
isomorphic to $K$. If $\alpha : M \to N$ is an $R$-linear map, show that $\alpha[H_M(K)] \subseteq H_N(K)$.

9. Show that every domain with a simple left ideal is a division ring. [Hint: Brauer's
lemma.]

10. If $R$ is semisimple show that $M_n(R)$ is semisimple for all $n \geq 1$. [Hint: Theorem 4.]

11. If $R$ is semisimple show that $R/A$ is a semisimple ring for all ideals $A$ of $R$.

12. If $R$ is semisimple show that $eRe$ is semisimple if $e^2 = e \in R$. [Hint: Theorem 4.]

13. If $R$ is semiprime and $e^2 = e \in R$, show that the following are equivalent: (1) $Re$ is
a simple left ideal; (2) $eRe$ is a division ring; (3) $eR$ is a simple right ideal. [Hint:
Lemma 4 §11.1.]


137 Yes, there are perfect rings, indeed left and right perfect rings. They were also introduced by
Bass, but a discussion of these rings is beyond the scope of this book.


14. Let $R$ = […] and $e$ = […]. Show that $eRe$ is a field and that $Re$ is
not simple (so the converse of Brauer's lemma is false). Show that $eR$ is simple.

15. If $_RR$ is semisimple, show that $R$ is right and left noetherian. [Hint: Lemma 4.]

16. Let $R$ be a semiprime ring.

(a) If $L$ and $M$ are left ideals, show that $LM = 0$ if and only if $ML = 0$.

(b) If $A$ and $B$ are ideals show that $AB = 0$ if and only if $A \cap B = 0$.

(c) If $A$ is an ideal, and $r \in R$, show that $rA = 0$ if and only if $Ar = 0$.

17. Let $R = R_1 \times R_2 \times \cdots \times R_n$, where the $R_i$ are rings. Show that $R$ is semiprime if and
only if each $R_i$ is semiprime.

18. If $R$ is semiprime show that $eRe$ is also semiprime for any $e^2 = e \in R$.

19. Show that a ring $R$ is semiprime if and only if $M_n(R)$ is semiprime for some (any)
$n \geq 1$.

20. A ring $R$ is called prime if $AB = 0$, $A$, $B$ ideals, implies that $A = 0$ or $B = 0$.

(a) Show that the commutative prime rings are the integral domains.

(b) Show that a ring $R$ is prime and left artinian if and only if $R \cong M_n(D)$ for
some $n \geq 1$ and some division ring $D$.

21. If $P_1, P_2, \ldots, P_n$ are projective modules, show that $P_1 \oplus P_2 \oplus \cdots \oplus P_n$ is projective.

22. If $M$ is a left module, define the socle of $M$, denoted soc($M$), to be the sum of all
the simple submodules of $M$. (Take soc($M$) = 0 if $M$ contains no simple submodule.)
Show that

(a) soc($M$) is fully invariant in $M$.

(b) If $N \subseteq M$ is a submodule then soc($N$) = $N \cap$ soc($M$).

(c) If $M = N_1 \oplus N_2$ then soc($M$) = soc($N_1$) $\oplus$ soc($N_2$).


Appendices 


APPENDIX A COMPLEX NUMBERS 


The set R of real numbers has deficiencies. For example, the equation $x^2 + 1 = 0$
has no real root; that is, no real number $u$ exists such that $u^2 + 1 = 0$. This type of
problem also exists for the set N of natural numbers. It contains no solution of the
equation $x + 1 = 0$, and the set Z of integers was invented to solve such equations.
But Z is also inadequate (for example, $2x - 1 = 0$ has no root in Z), and hence the
set Q of rational numbers was invented. Again, Q contains no solution to $x^2 - 2 = 0$,
so the set R of real numbers was created. Similarly, the set C of complex numbers
was invented that contains a root of $x^2 + 1 = 0$. More precisely, there is a complex
number $i$ such that


$i^2 = -1$.


However, the process ends here. The complex numbers have the property that every 
nonconstant polynomial with complex coefficients has a (complex) root. In 1799, 
at the age of 22, Carl Friedrich Gauss first proved this result, which is known as 
the Fundamental Theorem of Algebra. We give a proof in Section 6.6. 

In this appendix, we describe the set C of complex numbers. The set of real 
numbers is usually identified with the set of all points on a straight line. Similarly, 
the complex numbers are identified with the points in the euclidean plane by 
labeling the point with cartesian coordinates (a, b) as 


(a, b) = a+ bi. 
Then the set C of complex numbers is defined by 
C = {a+ bi | a and b in R}. 


When this is done, the resulting euclidean plane is called the complex plane.
Each real number $a$ is identified with the point $a = a + 0i = (a, 0)$ on the $x$-axis
in the usual way, and for this reason the $x$-axis is called the real axis. The points


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc. 




$bi = 0 + bi = (0, b)$ on the $y$-axis are called imaginary numbers, and the $y$-axis
is called the imaginary axis.188 The diagram shows the complex plane and several
complex numbers.

[Figure: the complex plane with several complex numbers plotted.]

Identification of the complex number $a + bi = (a, b)$ with the ordered pair $(a, b)$
immediately gives the following condition for equality:

Equality Principle. $a + bi = a' + b'i$ if and only if $a = a'$ and $b = b'$.

For a complex number $z = a + bi$, the real numbers $a$ and $b$ are called the real
part of $z$ and the imaginary part of $z$, respectively, and are denoted by $a = \mathrm{re}\,z$
and $b = \mathrm{im}\,z$. Hence, the equality principle becomes as follows: Two complex
numbers are equal if and only if their real parts are equal and their imaginary
parts are equal.

With the requirement that $i^2 = -1$, we define addition and multiplication of
complex numbers as follows:


$(a + bi) + (a' + b'i) = (a + a') + (b + b')i,$
$(a + bi)(a' + b'i) = (aa' - bb') + (ab' + ba')i.$

These operations are analogous to those for linear polynomials $a + bx$, with one
difference: $i^2 = -1$. These definitions imply that complex numbers satisfy all the
arithmetic axioms enjoyed by real numbers. Hence, they may be manipulated in the
obvious fashion, except that we replace $i^2$ by $-1$ whenever it occurs.
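
The two defining formulas translate directly into code. The following is a minimal sketch that represents a + bi as the ordered pair (a, b); the function names `add` and `mul` and the sample values are chosen only for illustration.

```python
def add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i, with numbers stored as pairs."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, with numbers stored as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


i = (0, 1)
assert mul(i, i) == (-1, 0)           # the requirement i^2 = -1

z, w = (2, -3), (-1, 1)               # z = 2 - 3i and w = -1 + i
print(add(z, w), mul(z, w))

# Python's built-in complex type follows the same rules.
assert complex(*mul(z, w)) == complex(*z) * complex(*w)
```

Storing a complex number as a pair mirrors the identification of a + bi with the point (a, b) of the plane.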


Example 1. If $z = 2 - 3i$ and $w = -1 + i$, then


$z + w = (2 - 1) + (-3 + 1)i = 1 - 2i,$
$z - w = (2 + 1) + (-3 - 1)i = 3 - 4i,$
$zw = (-2 - 3i^2) + (2 + 3)i = 1 + 5i,$
$\tfrac{1}{3}z = \tfrac{1}{3}(2 - 3i) = \tfrac{2}{3} - i,$
$z^2 = (2^2 + 9i^2) + 2(-6)i = -5 - 12i.$


Example 2. Find all complex numbers $z$ such that $z^2 = -i$.


Solution. Write $z = a + bi$, where $a$ and $b$ are to be determined. Then the condition
$z^2 = -i$ becomes
$(a^2 - b^2) + 2abi = 0 + (-1)i.$


Equating real and imaginary parts gives $a^2 = b^2$ and $2ab = -1$. The solution is
$a = -b = \pm\frac{1}{\sqrt{2}}$, so $z = \pm\left(\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}}i\right) = \pm\frac{1}{\sqrt{2}}(1 - i)$. □


Theorem 1 collects the basic properties of addition and multiplication of complex 
numbers. The verifications are straightforward and left to the reader. 




188 As the terms complex and imaginary suggest, these numbers met with some resistance when
they were first introduced. The names are misleading: These numbers are no more complex than
the real numbers, and $i$ is no more imaginary than $-1$. Descartes introduced the term imaginary
numbers.




Theorem 1. If $z$, $u$, and $w$ are complex numbers, then
(1) $z + w = w + z$ and $zw = wz$.
(2) $z + (u + w) = (z + u) + w$ and $z(uw) = (zu)w$.
(3) $z + 0 = z$ and $z \cdot 1 = z$.
(4) $z(u + w) = zu + zw$.


The following two notions are indispensable when working with complex numbers.
If $z = a + bi$ is a complex number, the conjugate $\bar{z}$ and the absolute value
(or modulus) $|z|$ are defined by


$\bar{z} = a - bi$ and $|z| = \sqrt{a^2 + b^2}$.


Thus, $\bar{z}$ is a complex number and is the reflection of $z$ in the real axis (see the diagram),
whereas $|z|$ is a nonnegative real number and equals the distance between $z$ and the origin.
Note that the absolute value of a real number $a = a + 0i$ is $|a| = \sqrt{a^2 + 0^2} = \sqrt{a^2}$ using the
definition of absolute value for complex numbers, which agrees with the absolute value of
$a$ regarded as a real number.


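Theorem 2. Let $z$ and $w$ denote complex numbers. Then
(1) $\overline{z \pm w} = \bar{z} \pm \bar{w}$
(2) $\overline{zw} = \bar{z}\,\bar{w}$
(3) $\overline{(\bar{z})} = z$
(4) $z$ is real if and only if $\bar{z} = z$
(5) $z\bar{z} = |z|^2$
(6) $|z| \geq 0$, and $|z| = 0$ if and only if $z = 0$
(7) $|zw| = |z||w|$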

Proof. We prove (2), (5), and (7) and leave the rest to the reader. If $z = a + bi$
and $w = c + di$, we compute


$\bar{z}\,\bar{w} = (a - bi)(c - di) = (ac - bd) - (ad + bc)i,$
$\overline{zw} = \overline{(a + bi)(c + di)} = \overline{(ac - bd) + (ad + bc)i} = (ac - bd) - (ad + bc)i,$


which proves (2). Next, (5) follows from
$z\bar{z} = (a + bi)(a - bi) = (a^2 + b^2) + (-ab + ba)i = a^2 + b^2 = |z|^2.$
Finally (2) and (5) give
$|zw|^2 = (zw)\overline{(zw)} = zw\,\bar{z}\,\bar{w} = z\bar{z}\,w\bar{w} = |z|^2|w|^2.$
Then (7) follows when we take positive square roots. □

Let $z$ be a nonzero complex number. Then (6) of Theorem 2 shows that $|z| \neq 0$,
and so $z\left(\frac{1}{|z|^2}\bar{z}\right) = 1$ by (5). As a result, we call the complex number $(1/|z|^2)\bar{z}$ the
inverse of $z$ and denote it $z^{-1} = 1/z$, which proves Theorem 3.




Theorem 3. If $z = a + bi$ is a nonzero complex number, then $z$ has an inverse
given by

$z^{-1} = \frac{1}{|z|^2}\bar{z} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}i.$


Hence, as for real numbers, dividing by any nonzero complex number is possible.
Example 3 shows how division is done in practice.


Example 3. Express $\frac{3 + 2i}{2 + 5i}$ in the form $a + bi$.


Solution. We multiply the numerator and denominator by the conjugate $2 - 5i$ of
the denominator:

$\frac{3 + 2i}{2 + 5i} = \frac{(3 + 2i)(2 - 5i)}{(2 + 5i)(2 - 5i)} = \frac{(6 + 10) + (4 - 15)i}{2^2 + 5^2} = \frac{16}{29} - \frac{11}{29}i.$ □
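
The inverse formula of Theorem 3 can be turned into an exact division routine. The sketch below stores a + bi as a pair of rational numbers using Python's `fractions` module; the function names are hypothetical and serve only this illustration.

```python
from fractions import Fraction


def mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, with numbers stored as pairs."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def inverse(z):
    """z^{-1} = conj(z) / |z|^2, computed exactly for rational a and b."""
    a, b = z
    norm_sq = a * a + b * b            # |z|^2 = a^2 + b^2
    if norm_sq == 0:
        raise ZeroDivisionError("0 has no inverse")
    return (Fraction(a, norm_sq), Fraction(-b, norm_sq))


def divide(z, w):
    """z / w, computed as z times the inverse of w."""
    return mul(z, inverse(w))


# (3 + 2i) / (2 + 5i) = 16/29 - (11/29)i
assert divide((3, 2), (2, 5)) == (Fraction(16, 29), Fraction(-11, 29))
```

Working with `Fraction` rather than floating point keeps the result in the exact form a + bi with rational a and b.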

The addition of complex numbers 
has a geometric description. The dia- 
gram shows plots of the complex numbers 
$z = a + bi$ and $w = c + di$ and their sum
$z + w = (a + c) + (b + d)i$. These points,
together with the origin, form the vertices of 
a parallelogram, so we can find the sum z + w 
geometrically by completing the parallelogram. This method is the Parallelogram 
Law of Complex Addition and is a special case of vector addition, as students 
of linear algebra will recognize. 

The geometric description of complex multiplication requires that complex numbers
be represented in polar coordinates. The circle with its center at the origin
and radius 1 shown in the diagram below is called the unit circle. An angle
$\theta$ measured counterclockwise from the real axis is said to be in standard position.
The angle $\theta$ determines a unique point $P$ on this circle. The radian
measure of $\theta$ is defined to be the length of the arc from 1 to $P$. Hence, the
radian measure of a right angle is $\pi/2$ radians and that of a full circle is $2\pi$
radians. We define the cosine and sine of $\theta$ (written $\cos\theta$ and $\sin\theta$) to be the $x$
and $y$ coordinates of $P$. Hence, $P$ is the point $(\cos\theta, \sin\theta) = \cos\theta + i\sin\theta$ in the
complex plane. These complex numbers $\cos\theta + i\sin\theta$ on the unit circle are denoted


$e^{i\theta} = \cos\theta + i\sin\theta.$

[Figure: the unit circle, showing the point $\cos\theta + i\sin\theta$ at radian measure $\theta$.]

A complete discussion of why we use this notation lies outside the scope of this
book.189

The fact that $e^{i\theta}$ is actually an exponential function of $\theta$ is confirmed by verifying
that the law of exponents holds, that is,

$e^{i\theta}e^{i\varphi} = e^{i(\theta + \varphi)}$ for any angles $\theta$ and $\varphi$.

189 An entire theory exists for the study of functions such as $e^z$, $\sin z$, and $\cos z$, where $z$ is a complex
variable. Many theorems can be proved in this theory, including the Fundamental Theorem of
Algebra mentioned previously.




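This law is analogous to the exponent rule $e^a e^b = e^{a+b}$ for real exponents $a$ and $b$,
and it is an immediate consequence of the identities for $\sin(\theta + \varphi)$ and $\cos(\theta + \varphi)$:

$e^{i\theta}e^{i\varphi} = (\cos\theta + i\sin\theta)(\cos\varphi + i\sin\varphi)$
$= (\cos\theta\cos\varphi - \sin\theta\sin\varphi) + i(\cos\theta\sin\varphi + \sin\theta\cos\varphi)$
$= \cos(\theta + \varphi) + i\sin(\theta + \varphi)$
$= e^{i(\theta + \varphi)}.$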


We can now describe complex multiplication geometrically. We let $z = a + bi$ be any complex
number. The distance $r$ from $z$ to 0 is the modulus $r = |z|$. If $z \neq 0$, it determines an
angle $\theta$, as shown in the diagram, called an argument of $z$. This angle is not unique ($\theta + 2\pi k$
would do as well for any $k = 0, \pm 1, \pm 2, \ldots$) but, as the diagram clearly shows,


$a = r\cos\theta$ and $b = r\sin\theta$


always hold. Hence, in any case

$z = r(\cos\theta + i\sin\theta) = re^{i\theta}.$

This expression is the polar form of the complex number $z$. The geometric
description of complex multiplication follows from the law of exponents.


Theorem 4. Multiplication Rule. If $z = re^{i\theta}$ and $w = se^{i\varphi}$ are two complex
numbers in polar form, then


$zw = rse^{i(\theta + \varphi)}.$

In other words, to multiply two complex numbers, simply multiply the absolute
values and add the arguments. This method simplifies calculations and is valid for
any arguments $\theta$ and $\varphi$.


Example 4. Multiply $(1 - i)(1 + \sqrt{3}i)$ by first converting the factors to polar form.


Solution. The polar forms (see the diagram) are
$1 - i = \sqrt{2}e^{-i\pi/4}$
and
$1 + \sqrt{3}i = 2e^{i\pi/3}.$
Hence, the multiplication rule gives
$(1 - i)(1 + \sqrt{3}i) = 2\sqrt{2}e^{i(\pi/3 - \pi/4)}$
$= 2\sqrt{2}e^{i\pi/12}$
$= 2\sqrt{2}(\cos\pi/12 + i\sin\pi/12).$



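Of course, direct multiplication gives $(1 - i)(1 + \sqrt{3}i) = (\sqrt{3} + 1) + (\sqrt{3} - 1)i$, so
equating real and imaginary parts gives the (somewhat unexpected) formulas


$\cos\left(\frac{\pi}{12}\right) = \frac{\sqrt{3} + 1}{2\sqrt{2}}$ and $\sin\left(\frac{\pi}{12}\right) = \frac{\sqrt{3} - 1}{2\sqrt{2}}.$ □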
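
The multiplication rule and the two formulas just obtained can be checked numerically. This sketch uses Python's standard `cmath` module; the function name `polar_multiply` is hypothetical, and the comparisons hold only up to floating-point rounding.

```python
import cmath
import math


def polar_multiply(z, w):
    """Multiply by the rule of Theorem 4: multiply moduli, add arguments."""
    r, theta = cmath.polar(z)          # z = r e^{i theta}
    s, phi = cmath.polar(w)            # w = s e^{i phi}
    return cmath.rect(r * s, theta + phi)


z = 1 - 1j
w = complex(1, math.sqrt(3))

# The polar rule agrees with direct multiplication.
assert cmath.isclose(polar_multiply(z, w), z * w)

# The arguments -pi/4 and pi/3 add up to pi/12.
assert math.isclose(cmath.phase(z) + cmath.phase(w), math.pi / 12)

# The formulas for cos(pi/12) and sin(pi/12).
assert math.isclose(math.cos(math.pi / 12), (math.sqrt(3) + 1) / (2 * math.sqrt(2)))
assert math.isclose(math.sin(math.pi / 12), (math.sqrt(3) - 1) / (2 * math.sqrt(2)))
```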
If $z = re^{i\theta}$ is given in polar form, $z^2 = r^2e^{2i\theta}$ by the multiplication rule. Hence,
$z^3 = z^2 \cdot z = (r^2e^{2i\theta})(re^{i\theta}) = r^3e^{3i\theta}$. In general, we have Theorem 5 for any $n \geq 1$ (we
leave the proof for $n < 0$ as Exercise 15(b)). The name honors Abraham DeMoivre
(1667-1754).


Theorem 5. DeMoivre's Theorem. If $\theta$ is any angle and $r > 0$, then $(re^{i\theta})^n = r^ne^{in\theta}$
for all integers $n$.


Example 5. Verify that $(-1 + \sqrt{3}i)^3 = 8$.

The polar form is $-1 + \sqrt{3}i = 2e^{2\pi i/3}$. Hence DeMoivre's theorem gives
$(-1 + \sqrt{3}i)^3 = (2e^{2\pi i/3})^3 = 2^3e^{2\pi i} = 8 \cdot 1 = 8.$ □


If $n \geq 1$, a complex number $u$ is called an nth root of unity if $u^n = 1$.
DeMoivre's theorem gives a way to find all possibilities (there are $n$). If we write
$u = re^{i\theta}$ in polar form and use DeMoivre's theorem, the condition $u^n = 1$ becomes


$r^ne^{in\theta} = 1e^{0i}.$


Comparing absolute values gives $r^n = 1$, so $r = 1$ (because $r$ is real and positive).
However, the arguments may differ by integral multiples of $2\pi$, so all we can conclude
is that $n\theta = 2k\pi$, where $k$ is an integer; that is,

$\theta = \frac{2\pi k}{n}$, $k$ an integer.


These arguments give distinct values of $u$ on the unit circle for $k = 0, 1, 2, \ldots, n - 1$,
as shown in the diagram. But every choice of $k$ yields a value of $\theta$ differing from
one of these by a multiple of $2\pi$, so they give all the possible roots. This proves
Theorem 6.


Theorem 6. The nth roots of unity are $w_k = e^{2\pi ki/n}$ for $k = 0, 1, 2, \ldots, n - 1$.


We find these roots geometrically as the distinct points on the unit circle, starting
at 1, that cut the circle into $n$ equal sectors. Note that if $n = 2$, the roots are
1 and $-1$, whereas the four 4th roots of unity are $1, i, -1$, and $-i$. In general, if we
write $w = e^{2\pi i/n}$, then $w^k = e^{2\pi ki/n}$ for each $k \geq 1$, so the nth roots of
unity are just the powers of $w$:


$1, w, w^2, \ldots, w^{n-1}$, where $w = e^{2\pi i/n}$.


For this reason, $w = e^{2\pi i/n}$ is called a primitive nth root of unity. It follows easily
(Exercise 16) that $1 + w + w^2 + \cdots + w^{n-1} = 0$ when $n \geq 2$, that is, the sum of the nth roots
of unity is zero.
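
Theorem 6 is easy to explore numerically. The following sketch lists the nth roots of unity with Python's `cmath` module and checks, up to rounding error, that each satisfies u^n = 1 and that they sum to zero for n at least 2. The function name and the tolerance 1e-9 are choices made for this illustration.

```python
import cmath


def roots_of_unity(n):
    """The n complex numbers e^{2 pi k i / n} for k = 0, 1, ..., n - 1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


for n in range(1, 9):
    roots = roots_of_unity(n)
    # Each root satisfies u^n = 1 (up to floating-point error).
    assert all(abs(u ** n - 1) < 1e-9 for u in roots)
    # For n >= 2 the roots sum to zero; for n = 1 the only root is 1.
    if n >= 2:
        assert abs(sum(roots)) < 1e-9

# The four 4th roots of unity are approximately 1, i, -1, -i.
print([complex(round(u.real, 12), round(u.imag, 12)) for u in roots_of_unity(4)])
```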




Exercises A 


1. Solve each equation for the real number $x$.
(a) $x - 4i = (2 - i)^2$

(b) $(2 + xi)(3 - 2i) = 12 + 5i$

(c) $(2 + xi)^2 = 4$

(d) $(2 + xi)(2 - xi) = 5$


2. Convert each expression to the form $a + bi$.


(a) Ca a aes (b) aay Os hea 
() 9-3; + aya (Da Be 
(e) i (f) (2—4)° 
(g) (1+2)* (h) (1—4)?(2 +4)? 

3. In each case, find the complex number $z$.
(a) iz -(1 +4)? =3-7 (b) (6+ 2) — 84(2—z) =iz+1 
(c) 22 = -i (d) 227=3-4i 


(e) 22+(1— 74) =(144)z 


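4. Let $\mathrm{re}\,z$ and $\mathrm{im}\,z$ denote the real and imaginary parts of $z$. Show that


(a) $\mathrm{im}(iz) = \mathrm{re}\,z$ (b) $\mathrm{re}(iz) = -\mathrm{im}\,z$
(c) $z + \bar{z} = 2\,\mathrm{re}\,z$ (d) $z - \bar{z} = 2i\,\mathrm{im}\,z$
(e) $\mathrm{re}(z + w) = \mathrm{re}\,z + \mathrm{re}\,w$, and $\mathrm{re}(tz) = t \cdot \mathrm{re}\,z$ if $t$ is real
(f) $\mathrm{im}(z + w) = \mathrm{im}\,z + \mathrm{im}\,w$, and $\mathrm{im}(tz) = t \cdot \mathrm{im}\,z$ if $t$ is real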


5. In each case, describe the graph of the equation, where $z$ denotes a complex number.


(a) $|z| = 1$ (b) $|z - 1| = 2$
(c) $z = i\bar{z}$ (d) $z = -\bar{z}$
(e) $z = |z|$ (f) $\mathrm{im}\,z = m \cdot \mathrm{re}\,z$, $m$ a real number


6. Verify $|zw| = |z||w|$ directly for $z = a + bi$ and $w = c + di$.

7. Prove that $|w + z|^2 = |w|^2 + |z|^2 + w\bar{z} + \bar{w}z$ for all complex numbers $w$ and $z$.

8. Show that $(1 + i)^n + (1 - i)^n$ is real for all integers $n \geq 1$.

9. (a) Complex Distance Formula. Show that $|z - w|$ is the distance between the
complex numbers $z$ and $w$.

(b) Triangle Inequality. Show that $|z + w| \leq |z| + |w|$ for all complex numbers $z$
and $w$. [Hint: Consider the triangle with vertices $0$, $w$, and $z + w$.]

10. Write each expression in polar form.


(a) $3 - 3i$ (b) $-4i$ (c) $-\sqrt{3} + i$
(d) $-4 + 4\sqrt{3}i$ (e) $-7i$ (f) $-6 + 6i$

11. Write each expression in the form $a + bi$.

(a) B3e™ (b) etni/3 (c) Qe3rt/4 

(d) Jfde7mi/4 (e) ebti/4 (f) 2./3e7277/6 
12. Write each expression in the form $a + bi$.

(a) (-1 + Vi)? (b) (1+ V32)-4 

(c) A+#® (d) (1-4)? 

(e) (1-4)°(v3 +4)? (£) (V3 —4)9(2 = 24)° 


13. Use DeMoivre's theorem to show that

(a) $\cos 2\theta = \cos^2\theta - \sin^2\theta$; $\sin 2\theta = 2\cos\theta\sin\theta$

(b) $\cos 3\theta = \cos^3\theta - 3\cos\theta\sin^2\theta$; $\sin 3\theta = 3\cos^2\theta\sin\theta - \sin^3\theta$

14. Find all complex numbers $z$ such that

(a) z#=-1 (b) 24 = 2(V/3¢ - 1) 

(c) 23 = —27% (d) 26 = —64 

15. Let $z = re^{i\theta}$ in polar form.

(a) Show that $\bar{z} = re^{-i\theta}$ and $z^{-1} = \frac{1}{r}e^{-i\theta}$.

(b) Prove DeMoivre's theorem for $n < 0$.




16. Show that the sum of the nth roots of unity is 0.
[Hint: $(1 - w^n) = (1 - w)(1 + w + w^2 + \cdots + w^{n-1})$.]

17. (a) Suppose that $z_1, z_2, z_3, z_4$, and $z_5$ are equally spaced around the unit circle. Show
that $z_1 + z_2 + z_3 + z_4 + z_5 = 0$. [Hint: $(1 - z)(1 + z + z^2 + z^3 + z^4) = 1 - z^5$ for any
complex number $z$.]
(b) Repeat (a) for any $n \geq 2$ points placed equally around the unit circle.

18. If $z = a + bi$, show that $|a| + |b| \leq \sqrt{2}\,|z|$. [Hint: $(|a| - |b|)^2 \geq 0$.]

19. Let $f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$ be a polynomial with real coefficients $a_i$.
If $z$ is a complex root of $f(x)$, that is, $f(z) = 0$, show that $\bar{z}$ is also a root.

20. If $f(x)$ is a polynomial with complex coefficients, let $\bar{f}(x)$ be the polynomial obtained
from $f(x)$ by taking the conjugate of every coefficient. Show that $f(x)\bar{f}(x)$ is a
polynomial with real coefficients.

21. Let $z \neq 0$ be a complex number. If $t$ is real, describe $tz$ geometrically if
(a) $t > 0$ (b) $t < 0$

22. If $z$ and $w$ are nonzero complex numbers, show that $|z + w| = |z| + |w|$ if and only if
one is a positive real multiple of the other. [Hint: Consider the parallelogram with
vertices $0$, $w$, $z$, and $z + w$. Use Exercise 21 and the fact that, if $t$ is real, $|1 + t| = 1 + |t|$
is impossible if $t < 0$.]

23. If $a$ and $b$ are rational numbers, let $p$ and $q$ denote numbers of the form $a + b\sqrt{2}$. If
$p = a + b\sqrt{2}$, define $\tilde{p} = a - b\sqrt{2}$ and $[p] = a^2 - 2b^2$. Show that each of the following
expressions holds.

(a) $a + b\sqrt{2} = a_1 + b_1\sqrt{2}$ only if $a = a_1$ and $b = b_1$.
(b) $\widetilde{p \pm q} = \tilde{p} \pm \tilde{q}$
(c) $\widetilde{pq} = \tilde{p}\,\tilde{q}$
(d) $[p] = p\tilde{p}$
(e) $[pq] = [p][q]$
(f) If $f(x)$ is a polynomial with rational coefficients and $p = a + b\sqrt{2}$ is a root of
$f(x)$, then $\tilde{p}$ is also a root of $f(x)$.


APPENDIX B MATRIX ALGEBRA


Matrix algebra will be familiar to most readers as it is standard fare in beginning
linear algebra courses. The new ingredient here is that the matrices we consider
will have entries drawn from an arbitrary commutative ring $R$, rather than from the
real numbers $\mathbb{R}$. A ring is an algebraic system in which there are two operations,
addition and multiplication, for which the usual laws of arithmetic are valid (see
Section 3.1). A ring $R$ is called commutative if $ab = ba$ for all $a, b \in R$. Examples
include the familiar number systems $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$. It is worth noting that the set
$M_{22}(\mathbb{R})$ of all $2 \times 2$ matrices over $\mathbb{R}$ is a ring that is not commutative.

In this appendix, the standard results from linear algebra about matrices, adjugates,
inverses, determinants, and so on, will be stated, with suitable minor modifications,
over an arbitrary commutative ring $R$. We omit most proofs. Many proofs from
linear algebra (where $R = \mathbb{R}$) remain valid over a general commutative ring, but this
is often not the case. Note that commutativity of $R$ is essential in many arguments,
especially when dealing with inverses and determinants. Hence,


$R$ denotes a commutative ring throughout Appendix B.




Matrix Algebra 


A rectangular array of elements of R is called a matrix and the elements themselves 
are called the entries of the matrix. Thus, 


1 -1 12 -1 : 

A=[, 2] B=|) 5 a o=|-] 

are matrices over $\mathbb{Z}$. The shape of a matrix depends on the number of rows and
columns, and an $m \times n$ matrix is one with $m$ rows and $n$ columns. Two matrices
are the same size if they have the same number of rows and the same number of
columns. Thus, the preceding matrices $A$, $B$, and $C$ are of size $2 \times 2$, $2 \times 3$, and $3 \times 1$,
respectively. An $n \times n$ matrix is called a square matrix.

The rows and columns of a matrix are numbered from the top down and from
left to right, respectively. Then the entry in row $i$ and column $j$ of a matrix $A$ is
called the $(i, j)$-entry of $A$. If the $(i, j)$-entry is denoted $a_{ij}$, then $A$ has the form


$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$


which usually is abbreviated as $A = [a_{ij}]$. Two $m \times n$ matrices $A = [a_{ij}]$ and
$B = [b_{ij}]$ are equal (written $A = B$) if they have the same size and corresponding
entries are equal, that is


$[a_{ij}] = [b_{ij}]$ if and only if $a_{ij} = b_{ij}$ for all $i$ and $j$.


The set of all $m \times n$ matrices with entries from $R$ is denoted $M_{mn}(R)$. For $A$ and
$B$ in $M_{mn}(R)$, we obtain their sum $A + B$ by adding corresponding entries. If
$A = [a_{ij}]$ and $B = [b_{ij}]$, this is


$A + B = [a_{ij} + b_{ij}]$.


This addition enjoys many of the properties of numerical addition. For example, if
$A$, $B$, and $C$ are in $M_{mn}(R)$, then


$A + B = B + A$ and $A + (B + C) = (A + B) + C$.


The matrix in $M_{mn}(R)$ each of whose entries is zero is called the zero matrix of
size $m \times n$ and is denoted $0$ (or $0_{mn}$ if the size must be emphasized). Clearly,


$A + 0 = A$, for all $A$ in $M_{mn}(R)$.


So $0$ plays the role in $M_{mn}(R)$ that the number zero plays in $\mathbb{Z}$. We obtain the
negative $-A$ of a matrix $A$ in $M_{mn}(R)$ by negating every entry of $A$. Hence,


$A + (-A) = 0$, for all $A$ in $M_{mn}(R)$.

Finally we define subtraction by $A - B = A + (-B)$. If $A = [a_{ij}]$ and $B = [b_{ij}]$,

$-A = [-a_{ij}]$ and $A - B = [a_{ij} - b_{ij}]$.

With these definitions, the additive arithmetic in Mmn(R) is entirely analogous to 
numerical arithmetic. 




We also use the following notation: If A is a matrix and r ∈ R, the matrix rA
is obtained by multiplying every entry of A by r. More formally,

\[ \text{if } A = [a_{ij}] \text{ then } rA = [r a_{ij}]. \]


This is called scalar multiplication and one verifies that it enjoys the following 
useful properties for all scalars r,s and matrices A, B: 

(1) r(A + B) = rA + rB

(2) (r + s)A = rA + sA

(3) r(sA) = (rs)A

(4) 1A = A
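
The entrywise definitions above translate directly into code. The following is a minimal sketch over ℤ, assuming matrices are stored as lists of rows; the function names `mat_add` and `scalar_mul` and the sample matrices are illustrative choices, not notation from the text. It checks properties (1) to (4) on one pair of matrices, which illustrates but of course does not prove them.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size (lists of rows)."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scalar_mul(r, A):
    """Multiply every entry of A by the scalar r."""
    return [[r * a for a in row] for row in A]


# Hypothetical sample matrices in M_22(Z), chosen only for illustration.
A = [[1, -2], [0, 3]]
B = [[4, 1], [-1, 2]]
r, s = 2, -5

assert scalar_mul(r, mat_add(A, B)) == mat_add(scalar_mul(r, A), scalar_mul(r, B))  # (1)
assert scalar_mul(r + s, A) == mat_add(scalar_mul(r, A), scalar_mul(s, A))          # (2)
assert scalar_mul(r, scalar_mul(s, A)) == scalar_mul(r * s, A)                      # (3)
assert scalar_mul(1, A) == A                                                        # (4)

print(mat_add(scalar_mul(2, A), scalar_mul(-5, B)))  # 2A - 5B = [[-18, -9], [5, -4]]
```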


Example 1. Given A= ie ‘| and B= [: A in Mo2(Z), we have 
24 —5B = i | = ie | = hae ol 

Example 2. If A= E Bs = 

Mo3(R) such that X + A= B. 


3 7 <1 


] and ae ta 


in Mo3(R), find X in 


Solution. We proceed as in numerical arithmetic and subtract A from both sides: 


poke ays 


Multiplication of matrices is less natural than addition. To describe it, we define
the dot product of a row matrix and a column matrix as follows:

\[
\begin{bmatrix} a_1 & a_2 & \cdots & a_k \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_k \end{bmatrix}
= a_1 b_1 + a_2 b_2 + \cdots + a_k b_k.
\]

Now let A be an m × k matrix and B be a k × n matrix, chosen so that the rows
of A and the columns of B have the same number k of entries. Then the product
AB is defined to be the m × n matrix whose


(i, j)-entry is the dot product of row i of A and column j of B.


Thus to compute the (i, j)-entry, go across the ith row of A and down the jth
column of B and form the dot product. Note: If A is m × k and B is k′ × n, then


AB is defined only if k = k′, and then the product AB is m × n.


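Example 3. For $A = \begin{bmatrix} 3 & -1 & 2 \\ 0 & 1 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 1 \\ 0 & 2 \\ -1 & 0 \end{bmatrix}$, compute AB and BA.


Solution. We write out the dot products explicitly.

\[
AB = \begin{bmatrix} 3 & -1 & 2 \\ 0 & 1 & 4 \end{bmatrix}
\begin{bmatrix} 2 & 1 \\ 0 & 2 \\ -1 & 0 \end{bmatrix}
= \begin{bmatrix} 6 + 0 - 2 & 3 - 2 + 0 \\ 0 + 0 - 4 & 0 + 2 + 0 \end{bmatrix}
= \begin{bmatrix} 4 & 1 \\ -4 & 2 \end{bmatrix},
\]

\[
BA = \begin{bmatrix} 2 & 1 \\ 0 & 2 \\ -1 & 0 \end{bmatrix}
\begin{bmatrix} 3 & -1 & 2 \\ 0 & 1 & 4 \end{bmatrix}
= \begin{bmatrix} 6 + 0 & -2 + 1 & 4 + 4 \\ 0 + 0 & 0 + 2 & 0 + 8 \\ -3 + 0 & 1 + 0 & -2 + 0 \end{bmatrix}
= \begin{bmatrix} 6 & -1 & 8 \\ 0 & 2 & 8 \\ -3 & 1 & -2 \end{bmatrix}. \qquad \Box
\]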
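
The row-by-column rule can be checked mechanically. The sketch below is one possible implementation over ℤ, assuming matrices are lists of rows; the names `dot` and `mat_mul` are illustrative, and the matrices A and B are taken to be the 2 × 3 and 3 × 2 integer matrices of Example 3.

```python
def dot(row, col):
    """Dot product of a row and a column with the same number of entries."""
    if len(row) != len(col):
        raise ValueError("row and column must have the same length")
    return sum(a * b for a, b in zip(row, col))


def mat_mul(A, B):
    """Product AB of an m x k matrix A and a k x n matrix B (lists of rows)."""
    if len(A[0]) != len(B):
        raise ValueError("AB is defined only if A has as many columns as B has rows")
    columns_of_B = list(zip(*B))  # column j of B as a tuple
    return [[dot(row, col) for col in columns_of_B] for row in A]


A = [[3, -1, 2], [0, 1, 4]]      # 2 x 3
B = [[2, 1], [0, 2], [-1, 0]]    # 3 x 2

print(mat_mul(A, B))  # [[4, 1], [-4, 2]]
print(mat_mul(B, A))  # [[6, -1, 8], [0, 2, 8], [-3, 1, -2]]
```

Note that AB is 2 × 2 while BA is 3 × 3, so the two products need not even have the same size.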




Example 4. If A= & a and B= ie | , compute A?, AB, and BA. 
Solution. A= & Bi S al = F ‘| 


so $A^2 = 0$ can occur even when A ≠ 0. Next


Hence AB ≠ BA is possible even though they are both the same size. □


Example 4 shows that two familiar properties of numerical algebra fail for ma- 
trices. Hence, it is surprising to learn that the following property does hold. 


Theorem 1. Let A, B, and C be of sizes m × p, p × q, and q × n, respectively. Then
(AB)C = A(BC).


Proof. Write $A = [a_{ij}]$, $B = [b_{ij}]$, and $C = [c_{ij}]$. Then $AB = [x_{ij}]$ where we have
$x_{ij} = \sum_{k=1}^{p} a_{ik} b_{kj}$, and $(AB)C = [y_{ij}]$ where $y_{ij} = \sum_{t=1}^{q} x_{it} c_{tj}$. Hence,

\[
y_{ij} = \sum_{t=1}^{q} \Big( \sum_{k=1}^{p} a_{ik} b_{kt} \Big) c_{tj}
       = \sum_{t=1}^{q} \sum_{k=1}^{p} a_{ik} b_{kt} c_{tj}
       = \sum_{k=1}^{p} a_{ik} \Big( \sum_{t=1}^{q} b_{kt} c_{tj} \Big).
\]

This last expression is the (i, j)-entry of A(BC), and the theorem follows. Note
that we needed the associativity of R to get $a_{ik}(b_{kt} c_{tj}) = a_{ik} b_{kt} c_{tj} = (a_{ik} b_{kt}) c_{tj}$.


We express this result by saying that matrix multiplication is associative when the 
matrix sizes are such that the products involved are all defined. 
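
The contrast between the failures noted in Example 4 and the associativity in Theorem 1 can be seen on a small case. This is a sketch only: the matrices A, B, and C below are hypothetical integer matrices chosen for illustration (they are not claimed to be those of Example 4), and `mat_mul` is an assumed helper name.

```python
def mat_mul(A, B):
    """Product of an m x k matrix A and a k x n matrix B (lists of rows)."""
    if len(A[0]) != len(B):
        raise ValueError("inner sizes must agree")
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


# Hypothetical sample matrices over Z.
A = [[1, 1], [-1, -1]]
B = [[0, 1], [1, 0]]
C = [[2, 3], [5, 7]]
ZERO = [[0, 0], [0, 0]]

assert A != ZERO and mat_mul(A, A) == ZERO                       # A^2 = 0 although A != 0
assert mat_mul(A, B) != mat_mul(B, A)                            # AB != BA
assert mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C))    # (AB)C = A(BC)
```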

The number 1 plays a neutral role in numerical multiplication in the sense that
1a = a and a1 = a for every number a. The analogous role in matrix algebra is
played by the identity matrices $I_n$. For each n ≥ 1, the matrix $I_n$ is defined to
be the n × n matrix with 1s along the main diagonal (upper left to lower right),
and 0s elsewhere. Thus,

\[
I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
I_4 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
\]


We use I without a subscript for the identity matrix when there is no need to
emphasize the size. The reader can verify that the relations

\[ AI = A \quad \text{and} \quad IB = B \]

hold whenever the matrix products are defined.
Note that rA = (rI)A for all r ∈ R and all matrices A. Moreover, since R is
commutative we have (when the matrix multiplications are defined)

\[ r(AB) = (rA)B = A(rB) \quad \text{for all } r \in R. \]




Square Matrices and Inverses 


We are interested primarily in square matrices. For convenience, we use the notation

\[ M_n(R) = M_{nn}(R) \quad \text{for any } n \geq 2. \]

If A and B lie in $M_n(R)$ then A + B and AB are both in $M_n(R)$. Theorem 2 collects
several properties of $M_n(R)$ for reference later.


Theorem 2. Let A, B, and C be matrices in $M_n(R)$. Then

(1) A + B = B + A
(2) (A + B) + C = A + (B + C)
(3) A + 0 = A
(4) A + (−A) = 0
(5) (AB)C = A(BC)
(6) AI = A = IA
(7) A(B + C) = AB + AC and (B + C)A = BA + CA


Proof. The only property not discussed previously is (7), and we leave the verification
as Exercise 12. □


The reader may have noted that Theorem 2 shows that $M_n(R)$ is a ring (noncommutative
if n ≥ 2). It is also noteworthy that Theorem 2 holds even if the ring R is
not commutative.

If A is a square matrix, a matrix B is called an inverse of A if 


BA=I and AB=I. 


If it exists, this matrix B is uniquely determined by A. For if AC = I also holds,
then AC = AB (both equal to I) so left multiplication by B gives C = B. A square
matrix A is called invertible if it has an inverse, and in this case the (unique)
inverse is denoted $A^{-1}$. Note that 0 has no inverse; in fact if AB = 0 for some
B ≠ 0 then A has no inverse.

Write $R^n$ for the set of n × 1 column matrices with entries from R. If $A \in M_n(\mathbb{R})$
it is known that A is invertible if and only if the system AX = B of linear equations
has a solution X for any $B \in \mathbb{R}^n$. This holds in general because AB = I in $M_n(R)$
implies BA = I (Exercise 13). On the other hand, if $A \in M_n(\mathbb{R})$ it is known that
A is invertible if and only if the homogeneous linear system AX = 0 has only the
trivial solution X = 0. But this condition fails for R = ℤ (consider A = 2I, where
AX = 0 forces X = 0 although 2I has no inverse over ℤ). Hence
if A is a square matrix, we need other ways to determine when $A^{-1}$ exists.

Again, it is well known that $A \in M_n(\mathbb{R})$ is invertible if and only if det A ≠ 0. If
R is any commutative ring it turns out that the determinant of $A \in M_n(R)$ can be
defined in such a way that A is invertible if and only if det A is a unit in R (u ∈ R
is a unit if uv = 1 = vu for some v ∈ R). The idea is to define det A inductively. If
n = 1 and A = [a], this holds if we define

\[ \det[a] = a. \]


If n ≥ 2, assume inductively that det A has been defined for all (n − 1) × (n − 1)
matrices over R. If A is n × n write $A_{ij}$ for the (n − 1) × (n − 1) matrix obtained
from A by deleting row i and column j. Then, given 1 ≤ i, j ≤ n we define the
(i, j)-cofactor of A by

\[ c_{ij}(A) = (-1)^{i+j} \det A_{ij}. \]

With this, we define det A as follows:

\[ \det A = a_{11} c_{11}(A) + a_{21} c_{21}(A) + \cdots + a_{n1} c_{n1}(A). \tag{$*$} \]


Then one shows that the following properties of determinants hold for rows: 


(a) If B is formed by multiplying a row of A by u ∈ R, then det B = u det A.
(b) If A contains a row of zeros then det A = 0. 
(c) If two rows of A are interchanged then det A changes sign. 


(d) If a multiple of a row of A is added to a different row, then det A is 
unchanged. 


(e) If two rows of A are identical then det A = 0. 
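
The inductive definition (∗) and the row properties (a) to (e) can be tried out on integer matrices. The following is a sketch under the assumption that matrices are lists of rows with 0-based indices; the names `minor` and `det` and the sample matrix M are illustrative. The recursion expands along the first column exactly as in (∗), and it is exponential in n, so it is meant for small examples only.

```python
def minor(A, i, j):
    """The matrix A_ij obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first column."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # With 0-based i and column 0, the sign (-1)**i equals (-1)^(i+j) in 1-based terms.
    return sum(A[i][0] * (-1) ** i * det(minor(A, i, 0)) for i in range(n))


M = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]   # hypothetical sample matrix over Z
d = det(M)                               # 39

scaled = [M[0], [3 * x for x in M[1]], M[2]]
assert det(scaled) == 3 * d                       # (a) scaling a row scales det
assert det([M[0], [0, 0, 0], M[2]]) == 0          # (b) a zero row gives det 0
assert det([M[1], M[0], M[2]]) == -d              # (c) a row swap changes the sign
added = [M[0], M[1], [x + 2 * y for x, y in zip(M[2], M[0])]]
assert det(added) == d                            # (d) adding a multiple of a row
assert det([M[0], M[0], M[2]]) == 0               # (e) two identical rows
```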


With this one can prove a result first given (for R = ℝ) in 1772 by Pierre Simon de
Laplace.


Theorem 3. Cofactor Expansion Theorem. If $A = [a_{ij}]$ is an n × n matrix
over a commutative ring R, then


(1) $\det A = \sum_{i=1}^{n} a_{ij} c_{ij}(A)$, for each j = 1, 2, ..., n,
(2) $\det A = \sum_{j=1}^{n} a_{ij} c_{ij}(A)$, for each i = 1, 2, ..., n.


Furthermore, (a)-(e) above hold for rows and for columns.


We omit the details.¹⁴⁰ The expressions in (1) and (2) of Theorem 3 are called the
cofactor expansions along column j and row i, respectively. In words, to find the
cofactor expansion of det A along a row or column, multiply the entries of the row
(column) by the corresponding cofactors, and add the results.

The cofactor expansion theorem has many important consequences, and the
following result is necessary to describe them. Given an n × n matrix $A = [a_{ij}]$, the
transpose $A^T$ of A is defined by

\[ A^T = [b_{ij}], \quad \text{where } b_{ij} = a_{ji} \text{ for all } i \text{ and } j. \]
This is an n × n matrix obtained from A by interchanging elements symmetric about
the main diagonal. For example, if $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ then $A^T = \begin{bmatrix} a & c \\ b & d \end{bmatrix}$. The familiar
elementary properties of the transpose hold

(1) $(A + B)^T = A^T + B^T$
(2) $(rA)^T = r A^T$
(3) $(AB)^T = B^T A^T$

when the matrix operations are defined (even for nonsquare matrices). We omit the 
proof of the following theorem. 


¹⁴⁰ The argument when R = ℝ appears in Section 3.6 of Nicholson, W.K. Linear Algebra with
Applications, 7th ed., McGraw-Hill Ryerson, 2012.




Theorem 4. Transpose Theorem. If A is a square matrix then $\det A^T = \det A$.


Surprisingly, the determinant function preserves matrix multiplication. Again
we omit the proof.

Theorem 5. Multiplication Theorem. If A and B are n × n matrices then
det(AB) = det A det B.


Given an n × n matrix $A = [a_{ij}]$ over a commutative ring R, we define the
adjugate of A (also called the classical adjoint of A) as follows:

\[ \operatorname{adj} A = [c_{ij}(A)]^T. \]

This is also n × n and the cofactor expansion theorem (with some ingenuity) gives

Theorem 6. Adjugate Theorem. If A is n × n and d = det A, then

\[ A (\operatorname{adj} A) = d I_n = (\operatorname{adj} A) A. \]

Again we omit the details.

If it happens that d = det A is a unit in R, then (as R is commutative) the
adjugate theorem gives

\[ A [d^{-1} \operatorname{adj} A] = I_n = [d^{-1} \operatorname{adj} A] A, \]

from which A is invertible and $A^{-1} = d^{-1} \operatorname{adj} A$. This proves part of


Theorem 7. Invertibility Theorem. If A is n × n then A is invertible if and
only if det A is a unit in R. In this case $\det(A^{-1}) = (\det A)^{-1}$.


The rest of the proof follows from Theorem 5 (Exercise 14).
The case of 2 x 2 matrices is simple and so arises frequently in examples. The 
relevant facts are displayed next. 


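Example 5. If n = 2 and $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ then we have $\operatorname{adj} A = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$ and
det A = ad − bc. Hence, if ad − bc is a unit in R then

\[ A^{-1} = (ad - bc)^{-1} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]

is a convenient formula for the inverse of A.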
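
The adjugate formula for the inverse works over any commutative ring, and the ring ℤ_m of integers modulo m makes a convenient test case. The sketch below assumes matrices are lists of rows of Python integers and uses illustrative names (`minor`, `det`, `adjugate`, `mat_mul`, `inverse_mod`); the modulus 26 and the sample matrices are hypothetical choices. It relies on `pow(d, -1, m)`, which computes a modular inverse in Python 3.8 or later.

```python
from math import gcd


def minor(A, i, j):
    """Delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by cofactor expansion along the first column."""
    if len(A) == 1:
        return A[0][0]
    return sum(A[i][0] * (-1) ** i * det(minor(A, i, 0)) for i in range(len(A)))


def adjugate(A):
    """Transpose of the matrix of cofactors c_ij(A)."""
    n = len(A)
    if n == 1:
        return [[1]]
    cof = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)] for i in range(n)]
    return [list(col) for col in zip(*cof)]


def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse_mod(A, m):
    """Inverse of A over Z_m, defined exactly when det A is a unit modulo m."""
    d = det(A) % m
    if gcd(d, m) != 1:
        raise ValueError("det A is not a unit in Z_m, so A is not invertible")
    d_inv = pow(d, -1, m)  # requires Python 3.8+
    return [[(d_inv * c) % m for c in row] for row in adjugate(A)]


M = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
d = det(M)
dI = [[d if i == j else 0 for j in range(3)] for i in range(3)]
assert mat_mul(M, adjugate(M)) == dI == mat_mul(adjugate(M), M)  # Theorem 6 over Z

A = [[3, 3], [2, 5]]              # det A = 9, a unit modulo 26
A_inv = inverse_mod(A, 26)
print(A_inv)                      # [[15, 17], [20, 9]]
assert [[x % 26 for x in row] for row in mat_mul(A, A_inv)] == [[1, 0], [0, 1]]
```

By contrast, `inverse_mod([[2, 0], [0, 2]], 26)` raises an error, since the determinant 4 is not a unit modulo 26.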


These results are enough to do most of the calculations in this book. A good 
reference source for this material (and much more) is the book by McDonald, B.R. 
Linear Algebra over Commutative Rings, Marcel Dekker Inc., New York, 1984. 


Multilinear Approach 


An n × n matrix A can be thought of as a row of columns in $R^n$, and it is instructive
to view matrices that way. Hence, we write $A = [A_1, \ldots, A_k, \ldots, A_n]$, where $A_k$
denotes column k of A. The determinant function $\det : M_n(R) \to R$ has two basic
properties from this point of view.

First, det is a multilinear function of the columns of a matrix, that is, it is a
linear function of column k for each k (when we fix all other columns of A). More
precisely, if we define

\[ \delta_k : R^n \to R \quad \text{by} \quad \delta_k(X) = \det[A_1, \ldots, X, \ldots, A_n] \quad \text{for all } X \in R^n, \]

the requirement is that, for each k,

\[ \delta_k(rX + sY) = r\,\delta_k(X) + s\,\delta_k(Y) \quad \text{for all } r, s \in R \text{ and } X, Y \in R^n. \]




Second, det is an alternating function of the columns of a matrix; that is, if
two distinct columns of A are interchanged to form B, then det B = −det A.

Before proceeding, write $X_n = \{1, 2, \ldots, n\}$ and recall the symmetric group $S_n$
(see Section 1.4) of all permutations of $X_n$, that is, all bijections $\sigma : X_n \to X_n$. Each
such σ is a product of transpositions, that is, permutations that interchange two
members of $X_n$. And σ is called even or odd according as it can be expressed as a
product of an even, respectively odd, number of transpositions (the parity theorem
ensures that this is well defined). The sign of σ is defined to be 1 or −1 according
as σ is even or odd, and is written $(-1)^{\sigma}$.

Hence, the fact that det is alternating means that if B is obtained from A by
a series of column transpositions, then $\det B = (-1)^{\sigma} \det A$ where σ is the corresponding
column permutation. With this it can be shown that d = det is the only
multilinear, alternating function $d : M_n(R) \to R$ that satisfies d(I) = 1. Moreover,
if we write σ(k) = σk when $\sigma \in S_n$ and $k \in X_n$, the following characterization of
det A can be proved.


Theorem 8. If $A = [a_{ij}]$ is n × n then

\[ \det A = \sum_{\sigma \in S_n} (-1)^{\sigma} a_{1\sigma 1} a_{2\sigma 2} \cdots a_{n\sigma n}, \]

where the sum ranges over all n! elements σ of $S_n$.


Theorem 8 leads to other important properties of the determinant, and is often 
taken as the definition. 
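
The sum over permutations in Theorem 8 can be evaluated directly for small n. The following sketch makes two assumptions that are not in the text: permutations act on {0, ..., n − 1} rather than {1, ..., n}, and the sign is computed by counting inversions, which has the same parity as any decomposition into transpositions. The names `sign` and `det_leibniz` and the matrix M are illustrative; `math.prod` needs Python 3.8 or later.

```python
from itertools import permutations
from math import prod


def sign(perm):
    """Sign of a permutation of 0..n-1, computed from its number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det_leibniz(A):
    """Sum over all n! permutations of sign * a_{1,s1} * ... * a_{n,sn}."""
    n = len(A)
    return sum(
        sign(p) * prod(A[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )


M = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]   # hypothetical sample matrix over Z
print(det_leibniz(M))                    # 39

# Alternating property: interchanging two columns changes the sign.
swapped = [[row[1], row[0], row[2]] for row in M]
assert det_leibniz(swapped) == -det_leibniz(M)
```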


Exercises B 


Throughout these exercises, R is assumed to be a commutative ring. 


1. If A is square, AB = 0, and B ≠ 0, show that A cannot be invertible.
2. (a) If $B = [B_1, \ldots, B_k, \ldots, B_n]$ where $B_k$ is column k of B, show that
$AB = [AB_1, \ldots, AB_k, \ldots, AB_n]$.
(b) If AX = 0 for every $X \in R^n$, show that A = 0.
2 -3 
and B=11. --2| show that AB=% but BAS f, 
6 —10 


0 -5 1 
3. 1fA=(3 5 4 


a 


4. Show that A= | : 
In this case find A7?. 
. Find invertible 2 x 2 matrices A and B such that A+ B is not invertible. 
. Find a matrix X such that AX = BifA= 5 “al and B= per 


b 
‘| is invertible in M2(R) if and only if a and c are both units. 


5 
6 4 8 0 3 6) 

7. If A and B are n × n, show that $(A - B)(A + B) = A^2 - B^2$ if and only if AB = BA.
8. If $A^2 = 0$, show that I + A is invertible and find $(I + A)^{-1}$ in terms of A.

9 


~TfA= p | , show that A? = J and use this result to find A7! in terms of A. 


10. Show that A= & 


‘i it satisfies A? — 34 —1027 =0 and use this result to find A-! 


in terms of A. 

11. Let A and B denote n × n matrices.
(a) If A and B are invertible, show that $A^{-1}$ and AB also are invertible, and find
formulas for the inverses.
(b) If A, B and A + B are invertible, show that $A^{-1} + B^{-1}$ is also invertible and find a
formula for the inverse.
(c) If I + BA is invertible show that I + AB is invertible and find a formula for
$(I + AB)^{-1}$. [Hint: A(I + BA) = (I + AB)A.]

12. Prove (7) of Theorem 2. (These expressions are called the distributive laws.)

13. (a) If $A, B \in M_n(R)$, show that AB = I implies that BA = I.
(b) Show that A is invertible if and only if AX = B has a solution for every $B \in R^n$.
[Hint: Write $I = [E_1, \ldots, E_k, \ldots, E_n]$, and use (a) and Exercise 2(a).]

14. Prove Theorem 7 using Theorems 5 and 6.

15. Let $E_{ij}$ denote the matrix in $M_n(R)$ with (i, j)-entry 1 and all other entries 0.
(a) Show that $E_{ik} E_{mj} = \delta_{km} E_{ij}$ where $\delta_{km}$ = 1 or 0 according as k = m or k ≠ m.
(b) Show that $E_{11} + E_{22} + \cdots + E_{nn} = I_n$.
(c) Show that if $A = [a_{ij}] \in M_n(R)$ then $A = \sum_{i,j} a_{ij} E_{ij}$.
Here the $E_{ij}$ are called matrix units in $M_n(R)$, and $\delta_{km}$ is called the Kronecker
delta.


APPENDIX C ZORN’S LEMMA 


The independent sets in a vector space are undeniably important, but the largest 
independent sets are the most important of all: they are the bases of the space. 
This theme that the “largest” objects of a given type are the most interesting is 
universal in mathematics, certainly in algebra. Zorn’s lemma is a set-theoretical 
principle that shows that maximal objects of various types exist and has become 
indispensable in many parts of mathematics. Clarifying what “maximal” means is 
best formulated using the concept of a partial ordering. 

A partial order on a nonempty set P is a relation¹⁴¹ ≤ on P that satisfies the
following conditions (where x, y, and z denote elements of P):


P1 x ≤ x for all x ∈ P (reflexivity).
P2 If x ≤ y and y ≤ z, then x ≤ z (transitivity).
P3 If x ≤ y and y ≤ x, then x = y (antisymmetry).

A set P with a partial ordering is called a partially ordered set (poset for short).
We say that (P, ≤) is a poset to assert that ≤ is a partial ordering on the set P.
Posets occur everywhere in mathematics; here are a few examples.


The following are easily verified to be partial orders:
Inclusion ⊆ on any nonempty collection of sets.
Divisibility | on any nonempty set of positive integers.
The usual ordering ≤ on the set ℝ of real numbers.


Nonempty subsets of a poset are again posets with the same partial order. 


Other examples will occur later. 

If ≤ is a partial order on a set P, we write x < y to mean x ≤ y and x ≠ y.
An element m ∈ P is called maximal in P if there is no element x of P such that
m < x, or equivalently,


If m ≤ x, where x ∈ P, then m = x.


¹⁴¹ See Section 0.4.




For example, if U is any nonempty set, let P denote the set of proper subsets X ⊂ U.
Then the maximal members of P (under inclusion) are the sets U \ {a} omitting
one element a ∈ U. Here is a more algebraic example.


Example 1. If R is a ring, the maximal ideals of R (Section 3.3) are defined to
be the maximal members of the poset P = {A ≠ R | A is an ideal of R} partially
ordered by inclusion. If R is commutative, Corollary 1 of Theorem 6 §3.3 shows
that an ideal A is maximal if and only if the factor ring R/A is a field.
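
For a finite poset the maximal elements can be found by testing the condition "m ≤ x implies m = x" directly. The sketch below assumes the poset is given as a list of elements together with a function `leq(x, y)` that decides x ≤ y; both the name `maximal_elements` and the two sample posets are illustrative. It reproduces the observation above that the maximal proper subsets of U are the sets U \ {a}.

```python
from itertools import combinations


def maximal_elements(P, leq):
    """Elements m of P such that m <= x (with x in P) forces x == m."""
    return [m for m in P if all(x == m for x in P if leq(m, x))]


# Divisibility on a hypothetical finite set of positive integers.
numbers = [2, 3, 4, 6, 8, 9, 12]
print(maximal_elements(numbers, lambda a, b: b % a == 0))  # [8, 9, 12]

# Proper subsets of U = {a, b, c}, ordered by inclusion.
U = frozenset("abc")
proper_subsets = [
    frozenset(c) for size in range(len(U)) for c in combinations(sorted(U), size)
]
maximal = maximal_elements(proper_subsets, lambda X, Y: X <= Y)
assert set(maximal) == {U - {a} for a in U}
```

Note that a poset may have several maximal elements, as both examples show, and none of them need be a largest element.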


Zorn’s lemma gives a condition that guarantees that maximal elements exist in
certain posets. To state it, we need some terminology. Let (P, ≤) be a poset. If
X ⊆ P is a nonempty subset, an element u ∈ P is called an upper bound for X
if x ≤ u for every x ∈ X. Note that u need not be an element of X. For example,
in the poset (ℝ, ≤) the interval X = (0, 1) = {r ∈ ℝ | 0 < r < 1} has no maximal
member, but any number u ≥ 1 is an upper bound on X.

If we are looking for maximal elements in a poset P, choose $x_1 \in P$. If $x_1$ is
maximal, we are done. Otherwise, $x_1 < x_2$ for some $x_2 \in P$. If $x_2$ is maximal,
we are finished; otherwise $x_1 < x_2 < x_3$ for some $x_3 \in P$. Hence, either we find
a maximal element at some stage, or we create elements $x_1, x_2, x_3, \ldots$ in P such
that $x_1 < x_2 < x_3 < \cdots$ is a strictly increasing sequence. So it is plausible that
guaranteeing the existence of maximal elements in P will require some restriction
on such ascending sequences from P. In 1935, Max Zorn found a condition on P that
does this and is reasonably easy to verify in specific situations.

Two elements x and y in a poset P are called comparable if either x ≤ y or
y ≤ x, and the poset P is called a chain if any two elements are comparable. Thus,
ℝ is a chain with respect to the usual partial ordering. A partially ordered set P is
said to be inductive if every chain in P has an upper bound in P.


Theorem. Zorn’s Lemma. Every inductive partially ordered set has a maximal 
element. 


Note that to show that a poset is inductive, it is not enough to check that all
countable¹⁴² chains $x_1 \leq x_2 \leq x_3 \leq \cdots$ have upper bounds in P. For example,
consider the set P of all countable subsets of ℝ, partially ordered by inclusion.
Then every countable chain in P has an upper bound in P, namely, its union (since
countable unions of countable sets are again countable). But P has no maximal member
because such a maximal set would have to be ℝ itself, and ℝ is not a countable set.

Zorn’s lemma has a wide variety of applications throughout mathematics; we 
give three examples from algebra. We begin with vector spaces. In Section 6.1, we 
proved that every finite dimensional vector space has a basis, but the proof that 
this holds in the infinite dimensional case requires Zorn’s lemma. If V is a vector 
space over a field F, let X be a nonempty (possibly infinite) subset of V. Then X 
is called independent if every finite subset of X is independent (in the sense of 
Section 6.1), and X is said to span V if every element of V is a linear combination 
of (a finite number of) elements of X. Finally, X is called a basis of V if it is
independent and spans V. 


¹⁴² A set X is called countable if it can be enumerated by the set ℕ of natural numbers, that is,
if $X = \{x_0, x_1, x_2, x_3, \ldots\}$.




Example 2. If F is a field, show that every nonzero vector space V over F has a basis.


Solution. The idea is to show that any maximal independent set is a basis. Let $\mathcal{I}$
denote the set of all independent sets in V and partially order $\mathcal{I}$ by inclusion. We
begin by using Zorn’s lemma to show that $\mathcal{I}$ has maximal members. First, $\mathcal{I}$ is not
empty since {v} is in $\mathcal{I}$ for all v ≠ 0 in V. Suppose that $C = \{X_i \mid i \in I\}$ is a chain in
$\mathcal{I}$; if $X = \bigcup_{i \in I} X_i$, we claim that X is independent. To see this, let $\{x_1, x_2, \ldots, x_n\}$
be a finite subset of X. Then each $x_k$ is in some $X_i$, so, since the $X_i$ form a chain,
$\{x_1, x_2, \ldots, x_n\} \subseteq X_m$ for some m. Hence, $\{x_1, x_2, \ldots, x_n\}$ is independent (since
$X_m$ is in $\mathcal{I}$), as required. This shows that X is in $\mathcal{I}$ and so is an upper bound for
C in $\mathcal{I}$.

Now Zorn’s lemma shows that $\mathcal{I}$ has a maximal member B, and we claim that
B is a basis of V. Since it is independent (being in $\mathcal{I}$), it remains to prove that
span(B) = V, where span(B) consists of all linear combinations of vectors in B.
Assume on the contrary that v ∉ span(B) for some v ∈ V; we show that {v} ∪ B is independent,
contradicting the maximality of B.

So let X be a finite subset of {v} ∪ B; we must show that X is independent.
If X ⊆ B, this follows because $B \in \mathcal{I}$. If X ⊄ B, then v ∈ X, so write
$X = \{v, x_1, x_2, \ldots, x_n\}$, $x_i \in B$. Let $a_0 v + a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0$, $a_i \in F$.
Then $a_0 = 0$ because otherwise v ∈ span(B), contrary to our choice (since $a_0^{-1}$
exists in the field F). Hence, $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0$, so $a_i = 0$ for i ≥ 1,
as required. □


It is worth noting that Example 2 is true for any division ring in place of F. In fact,
most of the theory of vector spaces goes through for division rings.

The next two examples of Zorn’s lemma come from ring theory. An additive
subgroup L of a ring R is called a left ideal if Ra ⊆ L for all a ∈ L, where
Ra = {ra | r ∈ R}. Right ideals are defined similarly, and the ideals of R (see
Section 3.3) are just the subsets that are both left and right ideals. A maximal left ideal M is defined
to be a maximal member of {L ≠ R | L is a left ideal}, partially ordered by inclusion.


Example 3. If L ≠ R is any left ideal, then L is contained in a maximal left ideal.


Solution. Let P = {X | X is a left ideal and L ⊆ X ≠ R}, partially ordered by
inclusion. Then P is not empty (L ∈ P), so it suffices to show that P contains a
maximal element. By Zorn’s lemma, it is enough to show that P is inductive. Hence,
let $\{X_i \mid i \in I\}$ be a chain from P; we show that $X = \bigcup_i X_i$ is an upper bound
for $\{X_i\}$. Clearly, X is a left ideal containing L, and we claim that X ≠ R. For if
X = R, then 1 ∈ X, say $1 \in X_k$ for some k ∈ I. Since $X_k$ is a left ideal, it follows
that $R = R1 \subseteq X_k \subseteq R$, so $X_k = R$, contrary to the fact that $X_k \in P$. So X is an
upper bound in P on the chain $\{X_i\}$, as required. □


As our last example of the use of Zorn’s lemma, we prove a theorem that is of
central importance in the theory of commutative rings. Let R be a commutative
ring. An ideal P of R is called a prime ideal of R if R/P is an integral domain,
or equivalently, if rs ∈ P, where r, s ∈ R, then either r ∈ P or s ∈ P (see Theorem 3
§3.3). Every commutative ring has at least one prime ideal by Example 3 (since
maximal ideals are prime in any commutative ring).

An element a ∈ R is said to be nilpotent if $a^n = 0$ for some n ≥ 1, and the
set nil(R) of all nilpotents in a commutative ring R is called the nil radical of
R. It is easy to verify that nil(R) ⊆ P for every prime ideal P, and hence that
nil(R) ⊆ ∩{P | P is a prime ideal of R}. The following example uses Zorn’s lemma
to show that this is in fact equality.


Example 4. If R is a commutative ring, nil(R) = ∩{P | P is a prime ideal of R}.


Solution. Let a ∈ P for every prime ideal P; by the above remarks, we must show
that a is nilpotent. So we assume that a is not nilpotent and show (using Zorn’s
lemma) that a ∉ P for some prime ideal P. To this end, let

\[ \mathcal{A} = \{A \mid A \text{ is an ideal of } R \text{ and } a^n \notin A \text{ for every } n \geq 1\}. \]

Then $\mathcal{A}$ is not empty as $0 \in \mathcal{A}$ since a is not nilpotent. Suppose $\{A_i \mid i \in I\}$ is a
chain from $\mathcal{A}$, and let $A = \bigcup_{i \in I} A_i$. Then A is an ideal and if $a^n \in A$, then $a^n \in A_k$
for some k, contradicting the fact that $A_k \in \mathcal{A}$. Hence, $a^n \notin A$ for each n, and so
$A \in \mathcal{A}$. This shows that $\mathcal{A}$ is inductive, so, by Zorn’s lemma, let $P \in \mathcal{A}$ be a maximal
member of $\mathcal{A}$. Then certainly $a = a^1 \notin P$, so it remains to show that P is a prime
ideal. To that end, let rs ∈ P, where r, s ∈ R; we must show that r ∈ P or s ∈ P.
Suppose, on the contrary, that r ∉ P and s ∉ P. Then

\[ Rr + P = \{tr + p \mid t \in R \text{ and } p \in P\} \text{ is an ideal of } R, \]

and P ⊂ Rr + P because r ∉ P. Since P is maximal in $\mathcal{A}$, it follows that Rr + P
is not in $\mathcal{A}$, and hence that $a^n \in Rr + P$ for some n ≥ 1, say $a^n = t_1 r + p_1$, where
$t_1 \in R$ and $p_1 \in P$. Similarly, since s ∉ P, there exists m ≥ 1 such that $a^m = t_2 s + p_2$,
where $t_2 \in R$ and $p_2 \in P$. But then

\[ a^{n+m} = (t_1 r + p_1)(t_2 s + p_2) = (t_1 t_2) rs + (t_1 r) p_2 + (t_2 s) p_1 + p_1 p_2. \]

Hence, $a^{n+m} \in P$ because rs, $p_1$, and $p_2$ are all in P. This contradiction (recall that $P \in \mathcal{A}$) completes
the proof. □
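
In a finite ring no appeal to Zorn’s lemma is needed, and the equality in Example 4 can be confirmed by brute force. The sketch below works in ℤ_n and rests on one standard fact that is assumed rather than taken from the text: every ideal of ℤ_n consists of the multiples of some divisor d of n. The function names are illustrative.

```python
def nilpotents(n):
    """Nilpotent elements of Z_n: those a with a^k = 0 (mod n) for some k >= 1."""
    result = set()
    for a in range(n):
        power = a % n
        for _ in range(n):          # exponents 1..n are more than enough
            if power == 0:
                result.add(a)
                break
            power = (power * a) % n
    return result


def ideals(n):
    """All ideals of Z_n, assuming each is the set of multiples of a divisor d of n."""
    return [frozenset(range(0, n, d)) for d in range(1, n + 1) if n % d == 0]


def is_prime_ideal(P, n):
    """P is proper, and rs in P implies r in P or s in P."""
    if len(P) == n:
        return False
    return all(
        (r * s) % n not in P or r in P or s in P
        for r in range(n) for s in range(n)
    )


def prime_intersection(n):
    """Intersection of all prime ideals of Z_n (n >= 2)."""
    primes = [set(P) for P in ideals(n) if is_prime_ideal(P, n)]
    return set.intersection(*primes)


print(sorted(nilpotents(12)))          # [0, 6]
print(sorted(prime_intersection(12)))  # [0, 6]
assert all(nilpotents(n) == prime_intersection(n) for n in range(2, 31))
```

For n = 12 the prime ideals found are the multiples of 2 and the multiples of 3, whose intersection is {0, 6}.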


The proof of Zorn’s lemma is difficult and will be omitted.¹⁴³ It requires (and
is in fact equivalent to) the Axiom of Choice, which asserts that if S is any family
of nonempty sets, we can form a set containing one element of each of the sets in S.
The axiom was first proposed in 1904 by Ernst Zermelo. At first glance, it seems
self-evident, but there may be infinitely many choices of elements to make. For
example, if S consists of all the bounded intervals from ℝ, we can choose (say) the
midpoint of each interval; but if S consists of all nonempty subsets of ℝ, then it
is not clear how to make all the choices. Bertrand Russell illustrates the point as
follows: If a man has infinitely many pairs of shoes, and infinitely many pairs of
socks, then he can easily choose one shoe from each pair (choose the left one, say),
but choosing one sock from each pair requires the axiom of choice.

Mathematicians’ attitudes about the axiom of choice vary from never using it to
making no distinction between mathematics assuming the axiom and mathematics
not assuming it. Irving Kaplansky takes a middle position: “I try to remember to
make a note of it when I use it, but I do not hesitate to use it.” Whatever your
attitude, Kurt Gödel showed in 1940 that the axiom of choice is consistent with the
other axioms of set theory; that is, it cannot be disproved using these axioms. Then
in 1963 Paul Cohen showed that the axiom of choice is independent of the other
axioms; that is, it cannot be proved from these axioms.


¹⁴³ See Kaplansky, I. Set Theory and Metric Spaces, Boston: Allyn & Bacon, 1972, Section 3.3.




Exercises C 


1. If M is a finitely generated module (see Section 6.1), show that every submodule
K ≠ M is contained in a maximal submodule N (that is, a submodule N ≠ M such
that if N ⊆ X, where X ≠ M is a submodule, then N = X).

2. Let K ⊆ M be modules (see Section 6.1).

(a) Show that there exists a submodule N maximal such that K ∩ N = 0.
(b) If N is as in (a), show that (K + N) ∩ X ≠ 0 for every submodule X ≠ 0.
[Hint: Consider the cases X ⊆ N and X ⊄ N separately.]

3. Show that every commutative ring R contains a minimal prime ideal Q, that is, a
prime ideal Q such that, if P ⊆ Q and P is a prime ideal, then P = Q.


APPENDIX D PROOF OF THE RECURSION THEOREM 


If A is a set, a mapping $\alpha : \mathbb{N} \to A$ is called a sequence from A. Sequences
are usually described as follows: If we write $\alpha(n) = a_n$ for each n ∈ ℕ, then the
sequence is denoted $a_0, a_1, a_2, a_3, \ldots, a_n, \ldots$. The recursion theorem is concerned
with recursively defined sequences wherein the first term $a_0$ is specified and the
later terms are uniquely determined by the earlier ones. Such sequences are unique
if they exist (see Theorem 3 §1.1); our task here is to prove existence.


Theorem. Recursion Theorem. Given a set A and a ∈ A, there is exactly one
sequence $a_0, a_1, a_2, a_3, \ldots, a_n, \ldots$ from A that satisfies the following requirements:

(1) $a_0 = a$.
(2) For n ≥ 1, the term $a_n$ is uniquely determined by $a_0, a_1, a_2, \ldots,$ and $a_{n-1}$.


Proof. For each n ≥ 1, let $\beta_n : A^n \to A$ be a mapping. We want to show that a
sequence $a_0, a_1, a_2, \ldots$ exists such that

\[ a_0 = a \quad \text{and} \quad a_n = \beta_n(a_0, a_1, \ldots, a_{n-1}) \quad \text{for each } n \geq 1. \]

The sequence is just a mapping $\alpha : \mathbb{N} \to A$ such that $\alpha(n) = a_n$, and we construct
α as a set of ordered pairs in ℕ × A (see Section 0.3). Call a subset λ ⊆ ℕ × A
“nice” if it satisfies the following two conditions:

(a) (0, a) is in λ.
(b) If $(k, x_k)$ is in λ for k = 0, 1, ..., n − 1, so also is $(n, \beta_n(x_0, x_1, \ldots, x_{n-1}))$.

ℕ × A is clearly “nice”. Let α denote the intersection of all “nice” subsets λ, that is

\[ \alpha = \{(n, x) \mid (n, x) \in \lambda \text{ for every “nice” subset } \lambda\}. \]

It is a routine verification that α is itself “nice”.

Claim 1. Given n ∈ ℕ, there exists x ∈ A such that (n, x) is in α.

Proof. If n = 0 then (0, a) is in α because α is “nice”. If n > 0 let $(k, x_k)$ be in α for
k = 0, 1, ..., n − 1. Write $x = \beta_n(x_0, \ldots, x_{n-1})$. Then each $(k, x_k)$ is in every “nice” set
λ by the definition of α, so (n, x) is in λ by (b). It follows that (n, x) is in α. This
proves Claim 1.

Claim 2. If both (n, x) and (n, y) are in α then x = y.

Proof. If n = 0 suppose that (0, y) is in α where y ≠ a. Consider α′ = α \ {(0, y)}.
It suffices to show that α′ is “nice” since this contradicts the choice of α. Clearly
(0, a) is in α′. If $(k, x_k)$ is in α′ for k = 0, 1, ..., n − 1 then all these pairs are in α,
so $(n, \beta_n(x_0, \ldots, x_{n-1}))$ is also in α. But n ≠ 0 so this is actually in α′. This shows
that α′ is “nice”. Now assume that Claim 2 is true for k = 0, 1, ..., n − 1, and that
(n, x) and (n, y) are both in α where x ≠ y. Then α″ = α \ {(n, y)} is “nice” just as
before, contrary to the choice of α. This proves Claim 2.

Finally let n ∈ ℕ. Then $(n, a_n)$ is in α for some $a_n \in A$ by Claim 1, and $a_n$ is
unique by Claim 2. Hence $a_0, a_1, a_2, \ldots$ is the desired sequence. □
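
Computationally, the recursion theorem says that a first term together with rules for producing each later term from all the earlier ones determines the whole sequence. The sketch below illustrates this for finitely many terms; modelling the family of maps β_n as a single Python function `beta(n, earlier_terms)` is an assumption made for convenience, as is the name `recursive_sequence`.

```python
def recursive_sequence(a, beta, length):
    """First `length` terms of the sequence with a_0 = a and
    a_n = beta(n, (a_0, ..., a_{n-1})) for n >= 1."""
    terms = [a]
    for n in range(1, length):
        terms.append(beta(n, tuple(terms)))
    return terms[:length]


# Factorials: a_0 = 1 and a_n = n * a_{n-1}.
print(recursive_sequence(1, lambda n, prev: n * prev[-1], 6))
# [1, 1, 2, 6, 24, 120]

# Fibonacci numbers: a_0 = 0, a_1 = 1, and a_n = a_{n-1} + a_{n-2} for n >= 2.
fib = lambda n, prev: 1 if n == 1 else prev[-1] + prev[-2]
print(recursive_sequence(0, fib, 8))
# [0, 1, 1, 2, 3, 5, 8, 13]

# A rule that uses every earlier term: a_n = 1 + (a_0 + ... + a_{n-1}).
print(recursive_sequence(1, lambda n, prev: 1 + sum(prev), 5))
# [1, 2, 4, 8, 16]
```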


Bibliography 


This list identifies some of the books that the interested reader can peruse for more 
information on the topics discussed in this book. The list is by no means complete. 


GENERAL ABSTRACT ALGEBRA 


Birkhoff, G. and MacLane, S. A Survey of Modern Algebra, 4th ed., New York: Macmillan,
1977.

Cohn, P.M. Algebra, Vols. 1 and 2, New York: Wiley, 1974, 1977. 

Dummit, D.S. and Foote, R.M. Abstract Algebra, 3rd ed., New York: Wiley, 2004.

Herstein, I.N. Topics in Algebra, 2nd ed., New York: Wiley, 1975.

Hungerford, T.W. Abstract Algebra, 2nd ed., New York: Holt, Rinehart and Winston,
1974.

Jacobson, N. Basic Algebra, Vols. 1 and 2, San Francisco: Freeman, 1974, 1980. 

Van der Waerden, B.L. Algebra, Vols. 1 and 2, 7th ed., New York: Ungar, 1970.


NUMBER THEORY 


Burton, D.M. Elementary Number Theory, Boston: Allyn & Bacon, 1980. 
Davenport, H. Higher Arithmetic, New York: Harper, 1960. 

Hardy, G.H. and Wright, E.M. An Introduction to the Theory of Numbers, 4th ed., Oxford: 
Clarendon Press, 1960. 
LeVeque, W.J. Topics in Number Theory, Vols. 1 and 2, Reading, MA: Addison-Wesley,
1956.
Niven, I. and Zuckerman, H.S. An Introduction to the Theory of Numbers, New York: 
Wiley, 1980. 


GROUP THEORY 


Hall, M. The Theory of Groups, New York: Macmillan, 1959. 

Kaplansky, I. Infinite Abelian Groups, 2nd ed., Ann Arbor: University of Michigan Press, 
1969. 

Kargapolov, M.I. and Merzljakov, Ju.I. Introduction to the Theory of Groups, New York: 
Springer-Verlag, 1979. 

Kurosh, A.E. The Theory of Groups, New York: Chelsea, 1960. 

Ledermann, W. Introduction to Group Theory, Edinburgh: Oliver and Boyd, 1973. 

Macdonald, I.D. The Theory of Groups, London: Oxford University Press, 1968.


Rose, J.S. A Course on Group Theory, Cambridge, England: Cambridge University Press, 
1978. 


Rotman, J.J. An Introduction to the Theory of Groups, 3rd ed., Boston: Allyn & Bacon, 
1984. 


RING THEORY 


Atiyah, M.F. and MacDonald, I.G. Introduction to Commutative Algebra, Reading, MA:
Addison-Wesley, 1969. 

Herstein, I.N. Noncommutative Rings, Carus Monograph 15, Washington, D.C.:
Mathematical Association of America, 1968.

Kaplansky, I. Commutative Rings, Chicago: University of Chicago Press, 1974. 

Lam, T.Y. A First Course in Noncommutative Rings, New York: Springer-Verlag, 1991. 

McCoy, N.H. Rings and Ideals, Carus Monograph 8, Washington, D.C.: Mathematical 
Association of America, 1948. 


McDonald, B.R. Linear Algebra over Commutative Rings, New York: Marcel Dekker, 
1984. 


FIELD THEORY 


Artin, E. Galois Theory, Notre Dame, Ind.: University of Notre Dame Press, 1944. 

Kaplansky, I. Fields and Rings, 2nd ed. (rev.), Chicago: University of Chicago Press, 
1972. 

Niven, I. Irrational Numbers, Carus Monograph 11, Washington, D.C.: Mathematical 
Association of America, 1956. 

Rotman, J. Galois Theory, New York: Springer-Verlag, 1990. 

Stewart, I.N. Galois Theory, London: Chapman and Hall, 1973.


RELATED BOOKS


Artin, E. Geometric Algebra, New York: Interscience, 1957. 

Curtis, C.W. and Reiner, I. Representation Theory of Finite Groups and Associative 
Algebras, New York: Wiley, 1962. 

Halmos, P.R. Naive Set Theory, New York: Springer-Verlag, 1974. 

Kaplansky, I. Set Theory and Metric Spaces, Boston: Allyn & Bacon, 1972. 

Lidl, R. and Pilz, G. Applied Abstract Algebra, New York: Springer-Verlag, 1984. 




MacWilliams, F.J. and Sloane, N.J.A. The Theory of Error-Correcting Codes, New York: 
Wiley, 1952. 

Solow, D. How to Read and Do Proofs, 2nd ed., New York: Wiley, 1990. 

Wilder, R.L. Introduction to the Foundations of Mathematics, New York: Wiley, 1952. 


HISTORICAL 


Bell, E.T. Men of Mathematics, 2nd ed., New York: Simon and Schuster, 1962. 

Boyer, C.B. A History of Mathematics, New York: Wiley, 1968. 

Courant, R. and Robbins, H. What is Mathematics?, Oxford: Oxford University Press,
1941. 

Kline, M. Mathematical Thought from Ancient to Modern Times, New York: Oxford 
University Press, 1972. 

Newman, J.R. The World of Mathematics (4 Vol.), New York: Simon and Schuster, 1956. 

Van der Waerden, B.L. A History of Algebra, New York: Springer-Verlag, 1985. 


Selected Answers 


EXERCISES 0.1 PROOFS 


1. 


(a) If n = 2k, k an integer, then n^2 = 4k^2 is a multiple of 4. The converse is true: If
n^2 = 4k, then n must be even because n odd implies n^2 odd.

(c) Verify that 2^3 - 6·2^2 + 11·2 - 6 = 0 and that 3^3 - 6·3^2 + 11·3 - 6 = 0.
The converse is false: 1^3 - 6·1^2 + 11·1 - 6 = 0 but 1 is not 2 or 3. Thus 1 is a
counterexample.
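
The arithmetic in (c) can be checked mechanically. The following Python sketch is not part of the original answer; it assumes the polynomial in question is x^3 - 6x^2 + 11x - 6, and the name `p` is illustrative only.

```python
# Illustrative check, assuming the cubic is x^3 - 6x^2 + 11x - 6.
def p(x: int) -> int:
    """Evaluate the assumed cubic at an integer x."""
    return x**3 - 6 * x**2 + 11 * x - 6

# 2 and 3 are roots, but so is 1, which is the counterexample to the converse.
roots = [x for x in range(-10, 11) if p(x) == 0]
print(roots)
```

Any integer root other than 2 or 3 found this way witnesses that the converse fails.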


. (a) Either n is even or it is odd; that is, n = 2k or n = 2k + 1. Then n^2 = 4k^2 or
n^2 = 4(k^2 + k) + 1.


. (a) If n is even, it cannot be prime unless n = 2 because, otherwise, 2 is a proper
factor. The converse is false: 9 is an odd integer greater than 2, which is not prime.
(c) If √a > √b, then (√a)^2 > (√b)^2; that is, a > b, contrary to hypothesis. The
converse is true: If √a ≤ √b, then (√a)^2 ≤ (√b)^2; that is, a ≤ b.


. (a) If √(x + y) = √x + √y, then x + y = (√x + √y)^2 = x + 2√(xy) + y. Hence √(xy) = 0,
from which xy = 0; therefore x = 0 or y = 0, contrary to hypothesis.


. (a) n = 11 is a counterexample because then n^2 + n + 11 has 11 as a factor.


EXERCISES 0.2 SETS 


(a) {x | x = 5k where k ∈ Z^+}
(a) {1, 3, 5, 7, ...} = {2k + 1 | k ∈ N} (c) {-1, 1, -3} (e) { } = ∅

. (a) Not equal: -1 ∈ A but -1 ∉ B (c) Equal to {a, l, o, y}
(e) Not equal: 1 ∈ A but 1 ∉ B (g) Equal to {-1, 0, 1}

(a) ∅, {2} (c) {1}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}

(a) True. As B ⊆ C, each element of B (in particular, A) is an element of C.
(c) False. A = {1}, B = C = {{1}, 2}.


. Every element of A ∩ B is in both A and B by definition, so A ∩ B ⊆ A and A ∩ B ⊆ B.
If X ⊆ A and X ⊆ B, then x ∈ X implies that x ∈ A and x ∈ B; that is, x ∈ A ∩ B.
Hence, X ⊆ A ∩ B.




11. (a) (x, y) ∈ A × (B ∩ C) if and only if x ∈ A and y ∈ B ∩ C; if and only if x ∈ A
and y ∈ B, and x ∈ A and y ∈ C; if and only if (x, y) ∈ A × B and (x, y) ∈ A × C;
if and only if (x, y) ∈ (A × B) ∩ (A × C). Hence A × (B ∩ C) and (A × B) ∩ (A × C)
have the same elements.


EXERCISES 0.3 MAPPINGS 


1. (a) Not a mapping: α(1) = -1 is not in N.

(c) Not a mapping: α(-1) = √(-1) is not in R.

(e) Not a mapping: α(6) = α(2·3) = (2, 3) and α(6) = α(1·6) = (1, 6).
(g) Not a mapping: α(2) not defined.


. (a) Bijective (c) Onto, but not one-to-one 


(e) One-to-one but not onto (g) One-to-one but not onto if |B| > 2


. (a) If c ∈ C, then c = βα(a) = β[α(a)] for some a ∈ A. As α(a) ∈ B, β is onto.

(c) If β(b) = β(b_1), write b = α(a) and b_1 = α(a_1), where a, a_1 ∈ A. Then β[α(a)] =
β[α(a_1)]; that is, βα(a) = βα(a_1). Because βα is one-to-one, a = a_1, which yields
b = α(a) = α(a_1) = b_1.


. (a) α^{-1}(y) = (1/a)(y - b) (c) α^{-1} = α
. If βα = 1_A, then α is one-to-one so, as |A| = |B| is finite, α is also onto. Hence
α^{-1} exists so α^{-1} = 1_A α^{-1} = βαα^{-1} = β1_B = β. Then αβ = αα^{-1} = 1_B and
β^{-1} = (α^{-1})^{-1} = α.


. φ^{-1}(x, y) = α where α(1) = x and α(2) = y.
. (b) ⇒ (c). If αγ = αδ, then γ = 1_A γ = (βα)γ = β(αγ) = β(αδ) = (βα)δ = 1_A δ = δ.


. (c) ⇒ (a). If b_0 ∈ B - α(A), choose a_0 ∈ A, and define β: B → B by:

β(b) = b, if b ≠ b_0,
β(b) = α(a_0), if b = b_0.

Deduce b_0 = α(a_0) using (c).


EXERCISES 0.4 EQUIVALENCES 


1. 


10. 


(a) Equivalence: [1] = [0] = [-1] = {1, 0, -1}, [2] = {2}, [-2] = {-2}.
(c) Not an equivalence: x ≡ x only if x = 1.

(e) Not an equivalence: 1 ≡ 2 but 2 ≢ 1.

(g) Not an equivalence: x ≡ x is never true.

(i) Equivalence: [(a, b)] = the line with slope 3 through (a, b).


» (a) As = {{(1, 9], {G, 2)], (4, 3)], (2, 3)], (3, 3)3 


(c) As = {((1, D}, (2,01, (3, DF 


. (a) Kernel equivalence of α: Z → Z, where α(n) = n^2; σ[n] = |n|.


(c) Kernel equivalence of α: R × R → R, where α(x, y) = y; σ[(x, y)] = y.


. (a) Not well defined: a(2) = a (2) =2 and a(2) =a (4) = 4, 


(c) Not well defined: a ($) =3 and a(4) =a (2) =6. 
(c) |A..| = |Q| =n by (c) of the preceding exercise. 


EXERCISES 1.1 INDUCTION 


2. 
3. 


i oe sje , 1 Vike +k+1 k+l _ ./Pap7 
(©) tot yet gem 2 VE + vert = vert = VE +1. 


(c) If 3^{2k+1} + 2^{k+2} = 7m, then 3^{2k+3} + 2^{k+3} = 7(9m - 2^{k+2}).
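
The inductive step in (c) can be sanity-checked numerically. This is a small Python sketch, not part of the original answer; it assumes the claim reads "7 divides 3^{2k+1} + 2^{k+2}", and the helper name `lhs` is illustrative.

```python
# Illustrative check, assuming the statement is: 7 | 3^(2k+1) + 2^(k+2).
def lhs(k: int) -> int:
    """The quantity claimed to be a multiple of 7."""
    return 3 ** (2 * k + 1) + 2 ** (k + 2)

for k in range(0, 20):
    m, r = divmod(lhs(k), 7)
    # Base claim: divisible by 7.
    assert r == 0
    # Inductive step: the next value equals 7(9m - 2^(k+2)).
    assert lhs(k + 1) == 7 * (9 * m - 2 ** (k + 2))
```

A finite check of this kind does not replace the induction; it only confirms the algebra of the step.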


18. 
24, 




. Clear if m = 1. In general, such a (k + 1) digit number must end in 4, 5, or 6, and
there are 3^k of each by induction. We are done since 3·3^k = 3^{k+1}.


. (a) If k > 2 cents can be made up, there must be a 2-cent or a 3-cent stamp. In the 


first case, replace a 2-cent stamp by a 3-cent stamp; in the second case, replace a 
3-cent stamp by two 2-cent stamps. 
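
The stamp argument can also be made constructive. The Python sketch below is an illustration only (the function name `stamps` is hypothetical); it returns one way to write k as 2a + 3b rather than following the replacement step of the induction.

```python
# Illustrative sketch: express k >= 2 cents using 2-cent and 3-cent stamps.
def stamps(k: int) -> tuple[int, int]:
    """Return (a, b) with 2*a + 3*b == k, for an integer k >= 2."""
    if k < 2:
        raise ValueError("k must be at least 2")
    if k % 2 == 0:
        return k // 2, 0          # even: 2-cent stamps only
    return (k - 3) // 2, 1        # odd: one 3-cent stamp, the rest 2-cent

for k in range(2, 30):
    a, b = stamps(k)
    assert a >= 0 and b >= 0 and 2 * a + 3 * b == k
```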


. If p_1 is true and p_k ⇒ p_{k+1}, show X = {n | p_n is false} is empty.
. If p_n is “n has a prime factor”, then p_2 is true. Assume p_2, ..., p_k are all true. If
k + 1 is a prime, we are done. If k + 1 = ab where 2 ≤ a ≤ k and 2 ≤ b ≤ k, then
a (and b) has a prime factor by strong induction. Thus, k + 1 has a prime factor.
(a) a, = 2-1)" (c) an = 3[1+(-1)"] 

(a) Verify p; and po. (c) Verify p1,p2,-.-,Pr0- 


EXERCISES 1.2 DIVISORS AND PRIME FACTORIZATION 


19, 


27. 


30. 
31. 
33. 
41. 


. (a) 391 = 23·17 + 0 (c) -116 = (-9)·13 + 1
. (a) n/d = 134.293..., so q = 134. Then r = 113.
. (a) 6 = 3·72 - 5·42 (c) 3 = 1·327 - 6·54

(e) 29 = 0·377 + 1·29 (g) 1 = -17·72 - 7·(-175)
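
Linear combinations such as 6 = 3·72 - 5·42 come from running the euclidean algorithm backwards. The following Python sketch of the extended euclidean algorithm is an illustration, not the book's presentation; the name `ext_gcd` is an assumption, and the coefficients it returns need not coincide with those printed above, since they are not unique.

```python
# Illustrative sketch of the extended euclidean algorithm.
def ext_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with g = gcd(a, b) and g == x*a + y*b, for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r   # remainder sequence
        old_x, x = x, old_x - q * x   # coefficient of a
        old_y, y = y, old_y - q * y   # coefficient of b
    return old_r, old_x, old_y

g, x, y = ext_gcd(72, 42)
assert g == x * 72 + y * 42
```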


. (a) If d = xm + yn, where x, y ∈ Z, then 1 = x(m/d) + y(n/d).
. If d = gcd(m, n) and d_1 = gcd(m_1, n_1), then d | m and d | n, so d | m_1 and d | n_1 by
hypothesis. Thus d | d_1.

If d = gcd(m, n) and d_1 = gcd(km, kn), then d | m and d | n, so kd | km and kd | kn.
Hence kd | d_1. To show that d_1 | kd, write km = qd_1 and kn = pd_1. We have
d = xm + yn where x and y ∈ Z, so kd = xkm + ykn = xqd_1 + ypd_1. Thus d_1 | kd.
Let d = gcd(m, p^k). Then d | p^k so d = p^j, j ≤ k. Show that j > 0 contradicts
gcd(m, p^k) = 1.
(a) 3478 (c) 11·13·17 (e) 241
(a) 5 and 16,170 (c) 139 and 278


(a) 25,200 has 90 positive divisors.
(a) gcd(28,665, 22,869) = 63, and lcm(28,665, 22,869) = 10,405,395.
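
These numerical answers are easy to confirm by machine. A minimal Python sketch follows; it is an illustration only, and `num_divisors` is a hypothetical helper using trial division rather than the prime-factorization formula.

```python
from math import gcd

# Illustrative helper: count positive divisors by trial division up to sqrt(n).
def num_divisors(n: int) -> int:
    count = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            count += 1 if d * d == n else 2  # d and n // d
        d += 1
    return count

a, b = 28665, 22869
g = gcd(a, b)
l = a * b // g   # lcm via gcd(a, b) * lcm(a, b) = a * b
print(num_divisors(25200), g, l)
```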


EXERCISES 1.3 INTEGERS MODULO n 


(a) True (c) True (e) True (g) False 
. (a) k ≡ 2 (mod 7) (c) k ≡ 0 (mod 9)
. (a) 2, 5, 10 (c) 3 

(a) 7 

(a) 7 


. One of a, a + 1 is even so 2 | a(a + 1)(a + 2); similarly, one of a, a + 1, a + 2 is a
multiple of 3. Since gcd(2, 3) = 1, it follows that 2·3 = 6 divides a(a + 1)(a + 2).


. Compute a? forO<a<b. 
i (al 27; 2 = 3s (c) 11, s = 16 
. (a) x = 8, y = 5 (c) No solution


(e) (x, y) = (0, 4), (1, 6), (2, 1), (3, 3), (4, 5), (5, 0), (6, 2).


. (a) 3, 6 (c) No solution 




31. (1) ⇒ (2). Let n = p^k a where p is a prime and p ∤ a. If a > 1 then gcd(n, a) = a > 1,
so ā has no inverse in Z_n. By (1), let ā^m = 0 in Z_n. Then n | a^m, so p | a^m, so p | a,
a contradiction.

35. (a) Working modulo p, x^2 = 1 means x^2 - 1 = 0. Thus (x - 1)(x + 1) = 0 so x = 1
or x = -1 by Theorem 7.


EXERCISES 1.4 PERMUTATIONS 


Oe car Olea Ws oa) 
SOG He Olga a) 
7. (a) 24 


123 45 67 8 9 
Oe a) 


13. (a) (1 4 8 3 9 5 2 7 6) (c) (1 2 8)(3 6 7)(4 9 5)
(e) (1 3 8 7 2 5)

17. (a) (1 4 3 2)(5 7 6) 

18. Odd 

19. (a) Even (c) Even (e) Odd 


25. It suffices to show that any pair of transpositions is a product of 3-cycles. If k, l, m,
and n are distinct, this follows from (k l)^2 = ε, (k l)(m n) = (k m l)(k m n), and
(k l)(k m) = (k m l).
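
The three identities can be verified for a concrete choice of k, l, m, n. The Python sketch below is illustrative only: it takes (k, l, m, n) = (1, 2, 3, 4), and it assumes products are composed right to left (the right-hand factor is applied first). The helpers `cycle` and `compose` are hypothetical names.

```python
# Illustrative check with k, l, m, n = 1, 2, 3, 4.
# ASSUMPTION: the product st means "apply t first, then s".
def cycle(*pts: int, n: int = 4) -> tuple[int, ...]:
    """The cycle (pts[0] pts[1] ...) on {1..n}, stored as the tuple of images of 1..n."""
    p = list(range(1, n + 1))
    for i, a in enumerate(pts):
        p[a - 1] = pts[(i + 1) % len(pts)]
    return tuple(p)

def compose(s: tuple[int, ...], t: tuple[int, ...]) -> tuple[int, ...]:
    """Product st: x is sent to s(t(x))."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

identity = tuple(range(1, 5))
k, l, m, n = 1, 2, 3, 4
assert compose(cycle(k, l), cycle(k, l)) == identity
assert compose(cycle(k, l), cycle(m, n)) == compose(cycle(k, m, l), cycle(k, m, n))
assert compose(cycle(k, l), cycle(k, m)) == cycle(k, m, l)
```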


EXERCISES 2.1 BINARY OPERATIONS 
1. (a) Not commutative or associative; no unity, so no units.
(c) Commutative, associative, unity is 0; if a ≠ 1, a^{-1} = a/(a - 1).
(e) Not commutative, associative; no unity, so no units.
(g) Commutative, associative; no unity, so no units.

(i) Not commutative, associative, unity is (1, 0, 1); if x ≠ 0 ≠ z,
(x, y, z)^{-1} = (1/x, -y/(xz), 1/z).


3. (a)
      | a  b
    --+------
    a | a  b
    b | b  a


7. M × N is commutative if and only if both M and N are commutative. (m, n) is a
unit if and only if both m and n are units, and then (m, n)^{-1} = (m^{-1}, n^{-1}).
9. (a) a^{24}a = a^{25} = (a^5)^5 = (b^5)^5 = b^{25} = b^{24}b = a^{24}b. Cancel a 24 times.
13. If (uv)w = 1, show that (vw)u = 1.
18. (2) ⇒ (1). If (ab)^{-1} = x then (ab)x = 1, so a(bx) = 1. Then (bx)a = 1 by (2). So
a is a unit. Similarly for b.


EXERCISES 2.2 GROUPS 


1. (a) Only 0 has an inverse. 
(c) Group; unity is -1, a^{-1} is -a - 2.
(e) Not closed: (1 2)(1 3) = (1 3 2) is not in G.
(g) Group: unity is 16; each element is self-inverse. 
(i) 2 + 2n has no inverse in G. 




3. (a) First ad = c, a^2 = d by the Corollary to
Theorem 6. Next ba ≠ b, a, d; and
ba = c ⇒ b = ac = a(ba) = (ab)a = 1a = a,
a contradiction. So ba = 1. Then bd = a,
bc = d, b^2 = c. Next, ca = b, cd = 1, c^2 = a,
cb = d. Finally, da = c, db = a, dc = 1, d^2 = b.
8. (a) Every element σ satisfies σ^2 = ε.

13. α is onto because α(g^{-1}) = g for all g ∈ G; α is one-to-one because g^{-1} = h^{-1}
implies that g = (g^{-1})^{-1} = (h^{-1})^{-1} = h.

23. (a) If g = g^{-1}, then g^2 = gg^{-1} = 1; if g^2 = 1, then g^{-1} = g^{-1}1 = g^{-1}g^2 = g.

29. (a) We first establish left cancellation: If gx = gy in G, then x = y. In fact, let
hg = e. Then gx = gy implies x = ex = hgx = hgy = ey = y. With this, the
fact that hg = e = e·e = hge gives g = ge by left cancellation. This shows that
e is the unity. Finally, h(gh) = (hg)h = eh = h = he, so gh = e, again by left
cancellation. Thus, h is the inverse of g.


EXERCISES 2.3 SUBGROUPS 


1. (a) No. 1+1 is not in Z. (c) No. 3? = 9 is not in H. 
(e) No. (1 2)(3 4)·(1 3)(2 4) = (1 4)(2 3) is not in H.
(g) Yes. 6 = 0 is the unity. (i) Yes. 


6) Valeo) y= nh? they = Gh) and a = (g"1)?: 
. (a) 1 = g^0, g^k g^m = g^{k+m}, and (g^k)^{-1} = g^{-k}; the subgroup test applies.
8. (a) 1 = x^0 ∈ ⟨X⟩. Clearly ⟨X⟩ is closed. If g = x_1^{k_1} ··· x_m^{k_m} ∈ ⟨X⟩, we obtain
g^{-1} = x_m^{-k_m} ··· x_1^{-k_1} ∈ ⟨X⟩. Hence, ⟨X⟩ is a subgroup; clearly x = x^1 ∈ ⟨X⟩ so X ⊆ ⟨X⟩.
15. (a) {1} and C_5 are the only subgroups of C_5.
[Lattice diagram: C_5 above {1}.]

(c) {ε}, K_1 = {ε, (1 2)}, K_2 = {ε, (1 3)}, K_3 = {ε, (2 3)},
H = {ε, (1 2 3), (1 3 2)}, and S_3.
[Lattice diagram: S_3 above H, K_1, K_2, K_3, each of which lies above {ε}.]

17. ⇒. If H ⊄ K, let h ∈ H - K. If k ∈ K, show kh ∉ K, hence kh ∈ H, whence k ∈ H.


EXERCISES 2.4 CYCLIC GROUPS AND THE ORDER OF AN ELEMENT 


1. (a) 9,97,9°,.9% (c) 99°,9°,99°,9",9",9"° 
2. (a) 1,2,3,4 (c) 1,3,5, 7,9, 11, 13, 15 
4. (a) Zt = (3) 

(c) Z_16^* is not cyclic; o(7) = o(9) = 2; o(3) = o(13) = o(5) = o(11) = 4
7. (a) 10 (c) 4
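
The orders quoted for the units modulo 16 can be recomputed directly. The Python sketch below is an illustration under the assumption that the group in 4(c) is the group of units of Z_16; the helper name `order` is hypothetical.

```python
from math import gcd

# Illustrative helper: multiplicative order of a unit a modulo n.
def order(a: int, n: int) -> int:
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k

units = [a for a in range(1, 16) if gcd(a, 16) == 1]
orders = {a: order(a, 16) for a in units}
print(orders)
# No element has order 8 = number of units, so the group is not cyclic.
```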




11. 
16. 


17. 
25. 


27. 


. (a) (1 2 3)(4 5) (a w) 
[Figures: subgroup lattice diagrams of G, with {1} at the bottom of each.]
(a) If G = ⟨a⟩ where o(a) = n, let g = a^k. Then g^n = (a^k)^n = a^{kn} = (a^n)^k = 1^k = 1.
(a) H=G (c) H = (a*) 


(e) H={(1, 1), (a,b), (a,b), (a°, b*), (a,b), (a, 6°), (a?, 1), (1, 67)} 

={(a*,b™) | k +m is even} 
(a) X ⊆ Y ⊆ ⟨Y⟩, ⟨Y⟩ a subgroup, so ⟨X⟩ ⊆ ⟨Y⟩ by Theorem 8.
Let G = ⟨g⟩ and H = ⟨h⟩ where o(g) = m, o(h) = n. As |G × H| = |G||H| = mn,
it suffices to show that o((g, h)) = mn. We have (g, h)^{mn} = (g^{mn}, h^{mn}) = (1, 1).
If (g, h)^k = (1, 1), then g^k = 1 and h^k = 1, so m | k and n | k. But gcd(n, m) = 1
then implies nm | k (Theorem 5 §1.2), so o((g, h)) = mn, as required.
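
The role of the hypothesis gcd(m, n) = 1 shows up clearly in a small experiment. The Python sketch below is illustrative only; it models the cyclic groups additively as Z_m and Z_n, with (1, 1) playing the part of (g, h), and `order_of_pair` is a hypothetical helper name.

```python
# Illustrative sketch: order of (1, 1) in the additive group Z_m x Z_n.
def order_of_pair(m: int, n: int) -> int:
    k, x, y = 1, 1 % m, 1 % n
    while (x, y) != (0, 0):
        x, y, k = (x + 1) % m, (y + 1) % n, k + 1
    return k

print(order_of_pair(4, 9))   # coprime orders: the pair generates the whole product
print(order_of_pair(4, 6))   # not coprime: the order falls short of 4 * 6
```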
(a) If A ⊆ B, show that g^a = g^{bq}, q ∈ Z. Since o(g) = ∞, a = qb. Conversely, if
a = qb, then g^a ∈ B, so A ⊆ B.


EXERCISES 2.5 HOMOMORPHISMS AND ISOMORPHISMS 


8. 


12. 
19. 


23. 


25, 


31. 


33. 


(a) If α: Z → Z, let α(1) = m. Then α(k) = α(k·1) = k[α(1)] = km. Thus, α
is multiplication by m, and each such map is a homomorphism Z → Z.

(a) Yes. (c) No, not one-to-one. (e) Yes. (g) Yes. (i) Yes.

(a) If z ∈ Z(G), then σ(z) ∈ Z(G_1) because, given g_1 = σ(g) in G_1, σ(z)·g_1 =
σ(zg) = σ(gz) = g_1·σ(z). Hence, σ: Z(G) → Z(G_1) is a mapping. It is one-to-one
because σ is, and σ(zw) = σ(z)·σ(w) clearly holds. If z_1 ∈ Z(G_1), let z_1 = σ(z),
z ∈ G. If g ∈ G, then σ(gz) = σ(g)·z_1 = z_1·σ(g) = σ(zg), so gz = zg because σ is
one-to-one. Thus, z ∈ Z(G), and σ is onto.

Write w = e^{2πi/3} ∈ C°. Suppose σ: C° → R* is an isomorphism, write σ(w) = r.
Then r^3 = [σ(w)]^3 = σ(w^3) = σ(1) = 1, so r = 1, so σ(w) = 1, so w = 1, a
contradiction.

Z is infinite cyclic, so Q ≅ Z is infinite cyclic too, say Q = ⟨q⟩ = {kq | k ∈ Z}. In
particular q^2 = k_0 q, so q = k_0 ∈ Z. Thus, Q = {kk_0 | k ∈ Z} ⊆ Z, a contradiction.
If σ: G → G is an automorphism, then o(σ(a)) = 2, so σ(a) = a. Because σ(1) = 1,
σ = 1_G and aut G = {1_G}.

Let G = ⟨a⟩, o(a) = ∞. If σ(a) = a^m, m ∈ Z, then a = σ(a)^k = a^{mk}. As o(a) = ∞ this
gives 1 = mk, whence m = ±1. If m = 1, then σ = 1_G; if m = -1 then σ(g) = g^{-1} for
all g ∈ G.


EXERCISES 2.6 COSETS AND LAGRANGE’S THEOREM 


1. 


(a) 1H = {1, a^4, a^8, a^12, a^16}
aH = {a, a^5, a^9, a^13, a^17}

a^2 H = {a^2, a^6, a^10, a^14, a^18}

a^3 H = {a^3, a^7, a^11, a^15, a^19}

1K = {1, a^2, a^4, a^6, a^8, a^10, a^12, a^14, a^16, a^18}
aK = {a, a^3, a^5, a^7, a^9, a^11, a^13, a^15, a^17, a^19}
(c) 0 + H = {2k | k ∈ Z}
0 + K = {3k | k ∈ Z}
1 + H = {2k + 1 | k ∈ Z}
1 + K = {3k + 1 | k ∈ Z}
2 + K = {3k + 2 | k ∈ Z}


. (a) a ≡ a because a^{-1}a = 1 ∈ H. If a ≡ b then b^{-1}a ∈ H, and it follows that
a^{-1}b = (b^{-1}a)^{-1} ∈ H, so b ≡ a. Finally, if a ≡ b and b ≡ c then b^{-1}a ∈ H and c^{-1}b ∈ H,
so c^{-1}a = (c^{-1}b)(b^{-1}a) ∈ H and a ≡ c.


. (a) The sets of positive and negative numbers. 


(c) If 0 ≤ t < 1, t + Z is the set of numbers at distance t to the right of an integer.


. (a) 6 


. (a) If o(g) = m, we show m = 12. We have m | 12 by Lagrange’s theorem. If
m ≠ 12, then m | 4 or m | 6, so g^4 = 1 or g^6 = 1, contrary to hypothesis.


. (a) If 1 = xm + yn, where x, y ∈ Z, then g = g^1 = (g^m)^x (g^n)^y = 1^x 1^y = 1.
. (a) Because a^k b a^k = b implies that a^{k+1} b a^{k+1} = aba = b, it holds for k ≥ 0. But
aba = b gives b = a^{-1}ba^{-1}, so a^{-k}ba^{-k} = b follows for k ≥ 1 in the same way.


. If |H : K| = n, let Kh_1, ..., Kh_n be the distinct cosets of K in H. This means
H = Kh_1 ∪ ··· ∪ Kh_n, a disjoint union. Then Hg ⊆ Kh_1 g ∪ ··· ∪ Kh_n g is clear,
and it is equality because K ⊆ H. Thus, each coset of H in G is the union of n
K-cosets. If |G : H| = m this gives |G : K| = mn = |G : H| |H : K|. Conversely,
if |G : K| is finite, then |H : K| is clearly finite and |G : H| is finite by the hint
since each H-coset is a union of K-cosets.


EXERCISES 2.7 GROUPS OF MOTIONS AND SYMMETRIES 


3. (a) If σ = (1 2 3), the group of motions is ⟨σ⟩ = {1, σ, σ^2}.

4. (a) If σ = (1 2 3 4), the group of motions is ⟨σ⟩ = {1, σ, σ^2, σ^3}.

6. (a) If σ = (1 2 3)(4 5 6) and τ = (1 4)(2 6)(3 5), the group G of motions is
G = {ε, σ, σ^2, τ, τσ, τσ^2} ≅ D_3.


EXERCISES 2.8 NORMAL SUBGROUPS 


11. 


15. 


. (a) Not normal (c) Normal 
. If D_4 = {1, a, a^2, a^3, b, ba, ba^2, ba^3}, where |a| = 4, |b| = 2, and aba = b, the normal
subgroups are {1}, D_4, Z = {1, a^2} = Z(D_4), H = ⟨a⟩, K_1 = {1, a^2, b, a^2 b} and
K_2 = {1, a^2, ba, ba^3}.


. First aKaq! is a subgroup, by Theorem 5 §2.3, and aKa'C aHa'C H because 


H<aG. If hE H, we must show h(aKa')h'C aka. We have h-!Kh= K 
as K<dH, so h(aKa")h'=ha(h 'Kh)a‘h-= (hah ')K(hah") ‘= K 
because K AG. | ' 

Let H and K be subgroups of G with |H| = p and |K| = q. Then H ∩ K = {1}
by Lagrange’s theorem. Moreover, H ⊲ G because it is unique of its order, and
similarly K ⊲ G. Hence, G ≅ H × K by the Corollary to Theorem 6. Since p and
q are primes, H and K are cyclic of relatively prime orders. Hence, H × K is cyclic
by Exercise 25 §2.4.

(a) Conclude that Ka = G - K = Kb^{-1}, so ab ∈ K.




17. 


24, 


26. 




(a) If H = ⟨a^d⟩, d | n, let n = md. Since ba^k is self-inverse for all k, we have
(ba^k)^{-1} a^{dt} (ba^k) = ba^k a^{dt} ba^k = b(a^{k+dt})ba^k = b·b·a^{-k-dt}a^k = b^2 a^{-k}a^{-dt}a^k =
a^{-dt} ∈ H.

(c) Let G = C_2 × C_2, where C_2 = ⟨a⟩, o(a) = 2, and let H = C_2 × {1}. Then H ⊲ G
because G is abelian. But σ: G → G given by σ(x, y) = (y, x) is an automorphism
and σ(H) ⊄ H. Thus, H is not characteristic in G.

(a) Write K = core H = ∩_a aHa^{-1}. Clearly 1 ∈ K. If g, g_1 ∈ K, then gg_1 ∈ aHa^{-1}
for all a, so gg_1 ∈ K. Also, g^{-1} ∈ aHa^{-1} for all a, so g^{-1} ∈ K. Hence, K is
a subgroup. If g ∈ G, k ∈ K then gkg^{-1} ∈ g[(g^{-1}a)H(g^{-1}a)^{-1}]g^{-1} = aHa^{-1} for
all a, as required.


EXERCISES 2.9 FACTOR GROUPS 


1. 


(a) If D_6 = {1, a, ..., a^5, b, ba, ..., ba^5}, where o(a) = 6, o(b) = 2, and aba = b, then
K = {1, a^3} by Exercise 26 §2.6, and D_6/K = {K, Ka, Ka^2, Kb, Kba, Kba^2}.

D_6/K  | K      Ka     Ka^2   Kb     Kba    Kba^2
-------+------------------------------------------
K      | K      Ka     Ka^2   Kb     Kba    Kba^2
Ka     | Ka     Ka^2   K      Kba^2  Kb     Kba
Ka^2   | Ka^2   K      Ka     Kba    Kba^2  Kb
Kb     | Kb     Kba    Kba^2  K      Ka     Ka^2
Kba    | Kba    Kba^2  Kb     Ka^2   K      Ka
Kba^2  | Kba^2  Kb     Kba    Ka     Ka^2   K


(c) K(a, b) = K(1, b) because (a, 1) ∈ K. Thus, G/K = {K(1, b) | b ∈ B}. Moreover,
K(1, b)·K(1, b_1) = K(1, bb_1), so the Cayley table is determined. Remark: The map
K(1, b) ↦ b is an isomorphism G/K → B.


3. (a) 6, 4, 3, 12
4. (a) 12
5. (a) 1, 2, 2, 2
7. If 0 < n < m in Z, then Z + 1/n ≠ Z + 1/m because 1/n - 1/m ∉ Z. Hence, Q/Z contains
the infinite set {Z + 1/n | n ≥ 1}. Now let Z + m/n be any element of Q/Z. Then
n(Z + m/n) = Z + m = Z, so Z + m/n has finite order.

13. (a) If z ∈ Z(G), then Kz ∈ Z(G/K), so z ∈ K by hypothesis. But then z ∈ Z(K),
so z = 1.
(c) Given z ∈ G, let (Kz)^{p^n} = K. Then z^{p^n} ∈ K, so (z^{p^n})^{p^m} = 1; that is, z^{p^{n+m}} = 1.
Hence o(z) divides p^{n+m}, so o(z) = p^k for some k ≥ 0.

19. (a) G' = {1}
(c) D_6' = ⟨a^2⟩ where D_6 = {1, a, ..., a^5, b, ba, ..., ba^5}, o(a) = 6, o(b) = 2, aba = b.

25. (a) [Ka, Kb] = Ka·Kb·(Ka)^{-1}·(Kb)^{-1}
= K(aba^{-1}b^{-1})
= K[a, b]

EXERCISES 2.10 THE ISOMORPHISM THEOREM 


4. (a) 1 ∈ α^{-1}(X) because α(1) = 1 ∈ X; if g and h ∈ α^{-1}(X), then α(gh) =
α(g)α(h) ∈ X and α(g^{-1}) = α(g)^{-1} ∈ X, which shows that gh ∈ α^{-1}(X) and g^{-1} ∈
α^{-1}(X). If X ⊲ α(G), let h ∈ α^{-1}(X), g ∈ G. Then α(ghg^{-1}) = α(g)α(h)α(g)^{-1} ∈
α(g)Xα(g)^{-1} = X, so ghg^{-1} ∈ α^{-1}(X). Hence, α^{-1}(X) ⊲ G.


8. 


10. 


13. 


17. 


21. 


33. 




(a) If C_6 = ⟨g⟩, |g| = 6, then the choice of α(g) ∈ K_4 determines α: C_6 → K_4. If
α(g) = 1, then α is trivial. If x ≠ 1 in K_4, then o(x) = 2 and we define α_x: C_6 → K_4
by α_x(g^k) = x^k. This mapping is well defined because

g^k = g^m ⇒ 6 | (k - m) ⇒ 2 | (k - m) ⇒ x^k = x^m.

Hence, α_x is a homomorphism and α_x(g) = x. Thus, these are the only nontrivial
homomorphisms.
(c) Let D_3 = {1, a, a^2, b, ba, ba^2} = ⟨a, b⟩, where
o(a) = 3, o(b) = 2, and aba = b, and let C_4 = ⟨c⟩,
o(c) = 4. If α: D_3 → C_4 is a homomorphism, write K = ker α. Then K ⊲ D_3, so
K = {1}, K = ⟨a⟩ or K = D_3. Now K = {1}
is impossible as α is not one-to-one (|D_3| = 6 does not divide |C_4| = 4). If K = D_3,
then α is trivial. So assume that α is not trivial. Then α(G) ≅ G/K = {K, bK}, so
α(G) is the unique subgroup of C_4 of order 2: α(G) = {1, c^2}. If φ: G → G/K is
the coset map, there is an isomorphism σ: G/K → α(G) such that α = σφ. Clearly,
σ(K) = 1 and σ(bK) = c^2. Hence

α(b^k a^m) = σφ(b^k a^m) = σ(b^k a^m K) = σ(b^k K)

= σ[(bK)^k] = [σ(bK)]^k = c^{2k}.

This is the only nontrivial homomorphism.
(a) No. If α: S_3 → K_4 were onto, then K_4 ≅ S_3/ker α, and |K_4| = 4 would divide
|S_3| = 6.
(c) Yes. S_3/A_3 ≅ C_2, say σ: S_3/A_3 → C_2 is an isomorphism. If φ: S_3 → S_3/A_3 is
the (onto) coset map, then σφ: S_3 → C_2 is an onto homomorphism.
Let G be simple. If α: G → G_1 is nontrivial, ker α ≠ G, so ker α = {1} by simplicity.
So α is one-to-one and G ≅ α(G) ⊆ G_1. Conversely, if G_1 has a subgroup
G_0 and σ: G → G_0 is an isomorphism, then σ: G → G_1 is a (one-to-one) homomorphism,
which is nontrivial because G_0 ≠ {1} (it is simple).
(a) If g ∈ G', write g = [a_1, b_1][a_2, b_2]···[a_n, b_n], where [a, b] = aba^{-1}b^{-1}. Then
α(g) = α[a_1, b_1]···α[a_n, b_n] = [α(a_1), α(b_1)]···[α(a_n), α(b_n)] ∈ G_1'.
(a) Define α: C* → R^+ by α(z) = |z| (|z| > 0 because z ≠ 0). Then α is a
homomorphism because |zw| = |z||w|, and ker α = {z | |z| = 1} = C°. Thus, the
isomorphism theorem gives C*/C° ≅ α(C*) ≅ R^+.
(a) Z_4 has subgroups H = {0}, {0, 2}, and Z_4. Hence Z_4/H ≅ Z_4, Z_2, {1}, so these
are the possible images.


EXERCISES 2.11 AN APPLICATION TO BINARY LINEAR CODES 


15. 


» (a) 5 (c) 6 

. (a) 3 (c) 7 

. (a) Detects 3, corrects 1. 

. (a) As k = 3 and t = 2, n must satisfy C(n, 0) + C(n, 1) + C(n, 2) ≤ 2^{n-3}. If n = 3, 4, ..., 8,
this expression reads 7 ≤ 1, 11 ≤ 2, 16 ≤ 4, 22 ≤ 8, 29 ≤ 16, and 37 ≤ 32. Hence
n ≥ 9. [Note: For n = 9, it reads 46 ≤ 64.]
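
The table of inequalities can be regenerated mechanically. The Python sketch below is illustrative and assumes the bound being tested is C(n, 0) + C(n, 1) + ... + C(n, t) ≤ 2^{n-k} with k = 3 and t = 2; the function names are hypothetical.

```python
from math import comb

# Illustrative check of the assumed bound: sum_{i<=t} C(n, i) <= 2^(n-k).
def ball_size(n: int, t: int = 2) -> int:
    """Number of words within distance t of a fixed word of length n."""
    return sum(comb(n, i) for i in range(t + 1))

def bound_holds(n: int, k: int = 3, t: int = 2) -> bool:
    return ball_size(n, t) <= 2 ** (n - k)

for n in range(3, 10):
    print(n, ball_size(n), 2 ** (n - 3), bound_holds(n))

smallest = min(n for n in range(3, 30) if bound_holds(n))
```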

(a) If C is a (4,2)-code that corrects one error, the weight of C must be at least 3 
so that the nonzero words in C are contained in {1111,1110,1101,0111}. But the 
sum of any two of these words is not in the set. 




10021011 
20. (a)G=[1 111 1] oo=foreet sy] 
0011110 
10141 
Del Ava 0111 
100 0 1110 
H=j]010 0 H=/1 00 0 
001 0 010 0 
0001 001 0 
0001 


21. (a) {0000, 1011, 0100, 1111} 
(c) {000000, 100101, 010110, 001001, 110011, 101100, 011111, 111010} 


EXERCISES 3.1 EXAMPLES AND BASIC PROPERTIES 


1. (a) Not an additive group.
(c) h(f + g) ≠ hf + hg can happen (try h(x) = x^2).


3. (a) [a b; c d][a' b'; c' d'] = [aa' + bc'  ab' + bd'; ca' + dc'  cb' + dd'] ∈ S because the
column sums are

(aa' + bc') + (ca' + dc') = (a + c)a' + (b + d)c' = (a + c)(a' + c')
(ab' + bd') + (cb' + dd') = (a + c)b' + (b + d)d' = (a + c)(b' + d') = (a + c)(a' + c').
The rest of the subring test is routine.


7. { [; *| ac 2(R)} 
14. Compute (1 + sr)[1 - s(1 + rs)^{-1}r].


15. (3) ⇒ (1). Let e be the unique right unity. Given b ∈ R, show that r(e + eb - b) = r
for all r ∈ R. Now use the uniqueness.

16. (a) Z1_R is a subring by Theorem 5. It is centered because s·(k1_R) = ks = (k1_R)·s for
all s ∈ R by Theorem 2.

18. (a) lcm(m, n) (c) 0

21. (a) (1 - 2e)^2 = 1 - 4e + 4e^2 = 1

22. (a) If a = (1 - e)re then the fact that e^2 = e gives ea = 0 and ae = a. It follows
that a^2 = (ae)a = a(ea) = a·0 = 0.

23. (4) ⇒ (1). If r ∈ R, a = (1 - e)re is nilpotent so u = 1 + a is a unit and so
commutes with e by (4). Conclude that re = ere.

29. (a) Units = {1,—1}; nilpotents = {0}; idempotents = {0, 1} 

6 1 0 aa Ol pe | 1 0 11 
Can k "| i | b | li: i I: 4 i 4] 
0 ool 0 0 11 
o o}’to of*{1 of? la 1]? 


0 07] fl 0 Lol 1 0] [Oo 07 fo 1 1 07 Jo 0 
idempotents: [o o][o o}e[o ole: lls alee ato oles a]: 
36. (a) If σ: C → R is an isomorphism, then a = σ(i) satisfies a^2 = -1, a contradiction.
(c) If Z ≅ Q, then Z is a division ring, a contradiction.


Nilpotents: 


EXERCISES 3.2 INTEGRAL DOMAINS AND FIELDS 


1. (a) 1, —4 (c) 0,1 
3. Idempotents = {0,1}; nilpotents = {0} 


23, 


26. 
29. 


32. 




. If ab = 0 show that (ba)^2 = 0.
. Try Z_p for various primes p.
. If z ∈ Z(R) and za = 1, showing that a ∈ Z(R) is sufficient. Given r ∈ R,
(ra - ar)z = raz - arz = r·1 - 1·r = 0, so (as za = 1) ra - ar = 0.


. (a) If 0 ≠ a ∈ K and ab = 1 with b ∈ K, conclude that b = a^{-1} where a^{-1} is the inverse
in F.


. Q(√2) is a subfield of R by Example 4. If F is any subfield of R then Z ⊆ F
(because 1 ∈ F), and hence Q ⊆ F (because n/m = nm^{-1} ∈ F for all n, m ≠ 0 in
Z). If also √2 ∈ F, this means r + s√2 ∈ F for all r, s ∈ Q. Thus Q(√2) ⊆ F.


If R = {r_1, r_2, ..., r_n} and 0 ≠ a ∈ R, then ar_1, ar_2, ..., ar_n are distinct (ar_i = ar_j
implies r_i = r_j as a ≠ 0). Hence {ar_1, ar_2, ..., ar_n} has n elements, and so equals
R. Hence, 1 = ar_i for some i.

(a) If r/u = r'/u' and s/v = s'/v', show (rs)(u'v') = (ru')(sv') = (r'u)(s'v) = (r's')(uv).


(a) If r = i and s = 1 in C, consider a = r + sw in C(w). Then aa* = r^2 + s^2 = 0,
but a ≠ 0 and a* ≠ 0 in C(w). Thus, C(w) is not a field. In Z_5(w) let a = 1 + 2w.
Then aa* = 1^2 + 2^2 = 0, and a ≠ 0 ≠ a*. So Z_5(w) is not a field. However, Z_7(w)
is a field. If a = r + sw ≠ 0 in Z_7(w) then aa* = r^2 + s^2 and it suffices to show
r^2 + s^2 ≠ 0 in Z_7. Suppose r^2 + s^2 = 0. If r = 0 or s = 0 then a = 0, contrary to
hypothesis. Thus r ≠ 0 ≠ s. Then 0 = s^{-2}(r^2 + s^2) = (s^{-1}r)^2 + 1 so (s^{-1}r)^2 =
-1 in Z_7. This is not the case because 0^2 = 0, 1^2 = 1 = 6^2, 2^2 = 4 = 5^2, 3^2 = 2 = 4^2
in Z_7.
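
The argument turns on whether -1 is a square modulo p. A short Python sketch, offered only as an illustration (the helper names are hypothetical), lists the squares modulo 5 and 7 and searches for nonzero pairs with r^2 + s^2 = 0.

```python
# Illustrative check: squares modulo p, and solutions of r^2 + s^2 = 0 in Z_p.
def squares(p: int) -> list[int]:
    return sorted({(a * a) % p for a in range(p)})

def nonzero_sum_of_squares_zero(p: int) -> list[tuple[int, int]]:
    """Pairs (r, s), not both zero, with r^2 + s^2 = 0 in Z_p."""
    return [(r, s) for r in range(p) for s in range(p)
            if (r, s) != (0, 0) and (r * r + s * s) % p == 0]

print(squares(5), squares(7))
print(nonzero_sum_of_squares_zero(5)[:3], nonzero_sum_of_squares_zero(7))
```

Modulo 5 the pair (1, 2) appears, matching a = 1 + 2w; modulo 7 the list is empty.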

(a) If q is a unit in H(R), then 1 = N(1) = N(qq^{-1}) = N(q)N(q^{-1}), so N(q) is
a unit in R. Conversely, qq* = N(q) shows q^{-1} = N(q)^{-1}q* if N(q) ∈ R*.


EXERCISES 3.3 IDEALS AND FACTOR RINGS 


Os © 


10. 


11. 
14. 


15. 


21. 
22. 


. (a) No (c) Yes (e) No 

. (c) If R is commutative then (r + A)(s + A) = rs + A = sr + A = (s + A)(r + A).
. Let ab ∈ Z × 0 where a = (n, m), b = (k, l). Show that ml = 0.

. (a) A = R because i is a unit in R. So |R/A| = 1.


(c) R/A = {0 + A, 1 + A, 2 + A, 3 + A, 4 + A} and |R/A| = 5
If 0 ≠ z ∈ Z(R), then Rz is a nonzero ideal of R (it contains z ≠ 0) and so Rz = R.
Hence 1 ∈ Rz, say 1 = az. It is enough to show that a ∈ Z(R). If r ∈ R, then
z(ra - ar) = r(za) - (za)r = r - r = 0. Hence

ra - ar = 1(ra - ar) = az(ra - ar) = a0 = 0

so ra = ar, as required.
(a) Show that A = {r ∈ R | nr = 0} is an ideal of R.
(a) X + Y is a subgroup because (1) 0 = 0 + 0 ∈ X + Y; (2) if r = x + y is
in X + Y then -r = (-x) + (-y) ∈ X + Y; and (3) if also r' = x' + y' is in
X + Y then r + r' = (x + x') + (y + y') ∈ X + Y. We have X ⊆ X + Y because
x = x + 0 ∈ X + Y for all x ∈ X. Similarly Y ⊆ X + Y.
A ∩ S is clearly an additive subgroup of S. If a ∈ A ∩ S and s ∈ S, then sa ∈ A because
A is an ideal and sa ∈ S because a ∈ S. Thus, sa ∈ A ∩ S; similarly, as ∈ A ∩ S.
(a) (r + A)^2 = r^2 + A = r + A for all r ∈ R because r^2 = r by hypothesis.
(a) If e^2 = e in R, show that (e + A)^2 = e + A in R/A. By hypothesis, e ∈ A or
1 - e ∈ A. But 0 is the only nilpotent idempotent.




23. 
25. 


33. 


34. 


(a) 0 (c) 2R = {0,2,4,6,8} 
(c) If u is a unit then u ∈ Ru implies Ru = R by Theorem 2. Conversely, if Ru = R
then 1 ∈ Ru, say 1 = vu, v ∈ R. Hence, u is a unit (R is commutative).
(c) In M2(R), A= [? | is nilpotent. If B= 5 | then BA= ° "| 
is not nilpotent. 

(c) Write Zpn= Z/p"Z. If A=Z/kZ#0 is an ideal of Zpyn, then p"ZCkZ, so 
k |p". Hence, k=p* for t<n, so AC M where M=Z/p/Z. It follows that M 
is the unique maximal ideal of Zpn, so Zpn is local and J(Z,.) = M. 


EXERCISES 3.4 HOMOMORPHISMS 


10. 


13. 


17. 


21. 


29, 


37. 


41. 


. (a) No (c) No (e) Yes 
. If θ: Z → Z is a general ring homomorphism, let θ(1) = e. Show that e^2 = e so
either e = 1 or e = 0. In the last case θ(k) = θ(k·1) = θ(k)·θ(1).


. O and RF up to isomorphism 


(4) Clearly, θ(r^0) = θ(1) = 1 = θ(r)^0. If θ(r^n) = θ(r)^n for some n ≥ 0, then
θ(r^{n+1}) = θ(r^n·r) = θ(r^n)·θ(r) = θ(r)^n·θ(r) = θ(r)^{n+1}.

(5) Note first that θ(u)·θ(u^{-1}) = θ(uu^{-1}) = θ(1) = 1 and, similarly, that θ(u^{-1})·
θ(u) = 1. So θ(u^{-1}) = θ(u)^{-1}. If k ≥ 0, then (4) gives (5). If k = -m, m > 0, then
θ(u^k) = θ[(u^{-1})^m] = [θ(u^{-1})]^m = [θ(u)^{-1}]^m = θ(u)^k.

In Z_7 this is 4n^2 = 2 and this has a solution (n = 2) in Z_7. In Z_11 it is 7m^2 = 9, or
m^2 = 8·9 = 72 = 6. But m^2 = 0, 1, 3, 4, 5, 9 in Z_11, so there is no solution.
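
Both congruences are small enough to settle by exhaustive search. The Python sketch below is an illustration only, assuming the reduced equations are 4n^2 = 2 in Z_7 and 7m^2 = 9 in Z_11 as stated in the answer.

```python
# Illustrative brute-force check of the two reduced congruences.
sols_mod_7 = [n for n in range(7) if (4 * n * n - 2) % 7 == 0]
sols_mod_11 = [m for m in range(11) if (7 * m * m - 9) % 11 == 0]
squares_mod_11 = sorted({(m * m) % 11 for m in range(11)})

print(sols_mod_7)       # n = 2 is among the solutions
print(sols_mod_11)      # empty: 6 is not a square modulo 11
print(squares_mod_11)
```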

R ≅ R for any ring R because 1_R: R → R is an isomorphism. If R ≅ S, say
σ: R → S is an isomorphism, then σ^{-1}: S → R is also an isomorphism, so
S ≅ R. If also S ≅ T, where τ: S → T is an isomorphism, then τσ: R → T is an
isomorphism and R ≅ T.

If θ: C → R is a ring homomorphism, then ker θ ≠ C because 1 ∉ ker θ (θ(1) = 1 ≠
0). Thus ker θ = 0 because C is a field, from which C ≅ θ(C) ⊆ R. If θ(i) = a, then
a^2 = θ(i)^2 = θ(i^2) = θ(-1) = -1, a contradiction.

(a) The map θ: R(w) → (R/A)(w) given by θ(a + bw) = ā + b̄w is an onto homomorphism
with kernel A(w).

Define θ: Z → Z_m × Z_n by θ(k) = (k + mZ, k + nZ). Show that θ is a ring
homomorphism and ker(θ) = tZ.

If 6= 4(1+uy/2), show that e? =e. If o: R R(V2)e is defined by o(r) =re, 
show that o is a ring isomorphism. Hence R & R(V2)e. 


EXERCISES 3.5 ORDERED INTEGRAL DOMAINS 


3. 


(a) If a ≥ 0, then |a| = a ≥ 0. If a < 0, then -a = 0 - a ∈ R^+, so |a| = -a > 0.
(c) If a = 0 or b = 0, then ab = 0 and |ab| = 0 = |a||b|. Assume that a ≠ 0 and
b ≠ 0.

(1) If a > 0 and b > 0, then ab > 0, so |ab| = ab = |a||b|.

(2) If a > 0 and b < 0, then ab < 0, so |ab| = -ab = a(-b) = |a||b|.

(3) If a < 0 and b > 0, the argument is like (2).
(4) If a < 0 and b < 0, then ab > 0, so |ab| = ab = (-a)(-b) = |a||b|.
Hence, |ab| = |a||b| in every case.


EXERCISES 4.1 POLYNOMIALS 


3. (a) 500 

. (a) In Z_6: 4, 5, 1, 2; in Z_7: 4, 5

. (a) In Z_4: 0, 1; in Z_2 × Z_2 all four elements are roots; in Z_6: 0, 1, 3, 4

. (a) Let ux^n and bx^m be the leading terms of f and g, where u is a unit. The
leading term of fg is ubx^{n+m} because ub ≠ 0 (otherwise b = u^{-1}(ub) = 0). Hence,
fg ≠ 0 and deg(fg) = n + m = deg f + deg g.

14. (a) gq=a0° + 30? —-3824+5,r=—-r@-3=52+3 


Noo 


(c) q = 3x^2 + 2x + 3, r(x) = 7 (e) q = 3x + 2, r(x) = -14x - 3
16. (a) 3, 5 
17. (a) f = (x - 1)(x + 1)(x - 5)(x + 5)
(c) f = (x - 1)(x + 2)(x + 8)
23. (a) 1 (c) 3 
25. (a) 3 (c) 2, —1 (e) None 


34. (c) If u ∈ C let ū denote the conjugate of u. Define θ: C[x] → C by θ[f(x)] = ū where u = f(0).
This is a homomorphism (it is evaluation at 0 followed by conjugation) but it is not
evaluation at a for any a ∈ C. Indeed, if θ = φ_a, then θ(i) = ī = -i while φ_a(i) = i.

38. (a) Show that R[x]/A[x] ≅ (R/A)[x] as rings (Exercise 37). Use Theorem 2.


EXERCISES 4.2 FACTORIZATION OF POLYNOMIALS OVER A FIELD 


4. (a) Irreducible (c) Not irreducible (e) Irreducible 
5. (a) Yes, no, no, no, no, yes, yes 
(c) Yes, yes, no, yes, no, yes, yes 
8. (a) As f is monic, we may assume that both factors are monic (Exercise 6). Hence
f = (x - u)(x - v) = x^2 - (u + v)x + uv. Now equate coefficients.
15. eee eitaitat2 244203420? +a41, 2420942042 2tt+a342n7422+1, 
gtd 
(a) 3x^4 + 2 = 3(x - 1)(x + 1)(x - 3)(x + 3) in Z_5[x].
(c) x^3 + 2x^2 + 2x + 1 = (x + 1)(x + 3)(x + 5) in Z_7[x].
(e) x^4 - x^2 + x - 1 = (x - 1)(x - 2)(x^2 + 3x + 6) in Z_13[x].
(a) Eisenstein criterion with p = 3
(a) f(x + 1) = x^4 + 4x^3 + 6x^2 + 6x + 2, so use the Eisenstein criterion with p = 2.
) 
) 


(a) If f is irreducible in K[x] it cannot factor properly in F[x].

a) Already irreducible * (c) f = (2? + 8x — 1)(x2? — x 4+ 2) 
a) 1 = (42? + 3a 4. 4)f ~ (4a 4+ 2)9 

c)-2=49-4(e8 +2? ~ax-1)f 

(a) Let 1 = mf + kg with m and k in F[x]. If h = pf and h = qg, then
h = hmf + hkg = (qg)mf + (pf)kg = (qm + pk)fg.


( 
( 
( 
( 
( 
31. ( 
( 
( 
( 
( 


508 Selected Answers 


EXERCISES 4.3 FACTOR RINGS OF POLYNOMIALS OVER A FIELD 
2. (a)

 +    | 0     1     t     1+t         ×    | 0     1     t     1+t
------+-----------------------      ------+-----------------------
 0    | 0     1     t     1+t         0    | 0     0     0     0
 1    | 1     0     1+t   t           1    | 0     1     t     1+t
 t    | t     1+t   0     1           t    | 0     t     1     1+t
 1+t  | 1+t   t     1     0           1+t  | 0     1+t   1+t   0

(c)

 ×        | 0  1        t        t^2      1+t      1+t^2    t+t^2    1+t+t^2
----------+------------------------------------------------------------------
 0        | 0  0        0        0        0        0        0        0
 1        | 0  1        t        t^2      1+t      1+t^2    t+t^2    1+t+t^2
 t        | 0  t        t^2      1        t+t^2    1+t      1+t^2    1+t+t^2
 t^2      | 0  t^2      1        t        1+t^2    t+t^2    1+t      1+t+t^2
 1+t      | 0  1+t      t+t^2    1+t^2    1+t^2    t+t^2    1+t      0
 1+t^2    | 0  1+t^2    1+t      t+t^2    t+t^2    1+t      1+t^2    0
 t+t^2    | 0  t+t^2    1+t^2    1+t      1+t      1+t^2    t+t^2    0
 1+t+t^2  | 0  1+t+t^2  1+t+t^2  1+t+t^2  0        0        0        1+t+t^2

(e)

 ×     | 0  1     -1    t   -t   1+t   1-t   -1+t  -1-t
-------+------------------------------------------------
 0     | 0  0     0     0   0    0     0     0     0
 1     | 0  1     -1    t   -t   1+t   1-t   -1+t  -1-t
 -1    | 0  -1    1     -t  t    -1-t  -1+t  1-t   1+t
 t     | 0  t     -t    0   0    t     t     -t    -t
 -t    | 0  -t    t     0   0    -t    -t    t     t
 1+t   | 0  1+t   -1-t  t   -t   1-t   1     -1    -1+t
 1-t   | 0  1-t   -1+t  t   -t   1     1+t   -1-t  -1
 -1+t  | 0  -1+t  1-t   -t  t    -1    -1-t  1+t   1
 -1-t  | 0  -1-t  1+t   -t  t    -1+t  -1    1     1-t

3. t^3 = 1 + t

 ×        | 0  1        t        t^2      1+t      1+t^2    t+t^2    1+t+t^2
----------+------------------------------------------------------------------
 0        | 0  0        0        0        0        0        0        0
 1        | 0  1        t        t^2      1+t      1+t^2    t+t^2    1+t+t^2
 t        | 0  t        t^2      1+t      t+t^2    1        1+t+t^2  1+t^2
 t^2      | 0  t^2      1+t      t+t^2    1+t+t^2  t        1+t^2    1
 1+t      | 0  1+t      t+t^2    1+t+t^2  1+t^2    t^2      1        t
 1+t^2    | 0  1+t^2    1        t        t^2      1+t+t^2  1+t      t+t^2
 t+t^2    | 0  t+t^2    1+t+t^2  1+t^2    1        1+t      t        t^2
 1+t+t^2  | 0  1+t+t^2  1+t^2    1        t        t+t^2    t^2      1+t
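
A multiplication table of this kind can be regenerated by machine. The Python sketch below is an illustration, not the book's method; it assumes the relation t^3 = 1 + t over Z_2 (the table of Exercise 3) and encodes c_0 + c_1 t + c_2 t^2 as the integer with bits c_0, c_1, c_2. The names `mul` and `show` are hypothetical.

```python
# Illustrative sketch: arithmetic in Z_2[t] with the ASSUMED relation t^3 = 1 + t.
# An element c0 + c1*t + c2*t^2 is stored as the 3-bit integer c0 | c1<<1 | c2<<2.
MODULUS = 0b1011  # t^3 + t + 1
DEGREE = 3

def mul(a: int, b: int) -> int:
    """Multiply two elements and reduce using t^3 = 1 + t."""
    prod = 0
    for i in range(DEGREE):                 # carry-less (XOR) multiplication
        if (b >> i) & 1:
            prod ^= a << i
    for i in range(2 * DEGREE - 2, DEGREE - 1, -1):   # clear bits t^4, then t^3
        if (prod >> i) & 1:
            prod ^= MODULUS << (i - DEGREE)
    return prod

def show(a: int) -> str:
    """Render an element in the form used in the table, e.g. 1+t^2."""
    terms = [name for bit, name in enumerate(["1", "t", "t^2"]) if (a >> bit) & 1]
    return "+".join(terms) if terms else "0"

order = [0b000, 0b001, 0b010, 0b100, 0b011, 0b101, 0b110, 0b111]
for a in order:
    print(show(a).ljust(8), " ".join(show(mul(a, b)).ljust(8) for b in order))
```

Every nonzero row of the output contains 1, which reflects the fact that each nonzero element has an inverse under this relation.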
5. (a) Zsler]/ (0° - «+ 1) (c) Zaife)/ (x? +2 +1) 


6. (a) Here t^2 = t. Idempotents: 0, t, 1, 1 - t; nilpotents: 0; units: a + bt, where
a ≠ 0 ≠ a + b
7. (a) 5(-14+t412?) \ 




14. (a) (x + t)(x + t^2)(x + t + t^2), where t^3 = 1 + t
(c) (x - t)(x - 1 - t)(x + 1 - t), where t^3 = t - 1

21. (a) If x^2 + ax + b is not irreducible over F, it must have a root u ∈ F. Thus,
u^2 + au + b = 0. Take c = 2u + a. Then c^2 = 4(u^2 + ua) + a^2 = -4b + a^2, contrary
to hypothesis.

25. (a) Write polynomials as f = f(x). Then d ∈ (f) + (g) because d = uf + vg for
some u, v ∈ F[x]. On the other hand, f ∈ (d) and g ∈ (d) because d is a common
divisor of f and g. Hence (f) ⊆ (d) and (g) ⊆ (d), so (f) + (g) ⊆ (d).


EXERCISES 4.4 PARTIAL FRACTIONS 


1 2 
2. - 
(ie a+ae+1 
1 x l-«@ 
(c) F 


EXERCISES 4.5 SYMMETRIC POLYNOMIALS 


2. (a) (y?z?) + (a3 + cyz + 272) + (2? + wz — yz) + (3a — 3y) 
7. (a) r3a3 < 2yr3< xy 22zr3 < x? Z9 
8. (a) f = 8183 (c) i = 818283 — 382 
11. p₅ = s₁⁵ − 5s₁³s₂ + 5s₁²s₃ + 5s₁s₂² − 5s₁s₄ − 5s₂s₃ + 5s₅
12. (a) f = (n − 1)s₁² − 2ns₂
13. (a) 2° —172? —- 142-9 
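
The expression for the power sum p₅ in terms of the elementary symmetric polynomials s₁, ..., s₅ (answer 11) can be spot-checked numerically. This is a minimal sketch, assuming p₅ = x₁⁵ + ··· + xₙ⁵ and the usual definition of sₖ as the sum of all k-fold products; the function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elementary(xs, k):
    """k-th elementary symmetric polynomial s_k evaluated at xs (0 if k > len(xs))."""
    return sum(prod(c) for c in combinations(xs, k))

def power_sum(xs, k):
    """k-th power sum p_k evaluated at xs."""
    return sum(x ** k for x in xs)

def p5_from_elementary(xs):
    """Right-hand side of the formula for p_5 in terms of s_1, ..., s_5."""
    s1, s2, s3, s4, s5 = (elementary(xs, k) for k in range(1, 6))
    return (s1**5 - 5*s1**3*s2 + 5*s1**2*s3 + 5*s1*s2**2
            - 5*s1*s4 - 5*s2*s3 + 5*s5)

if __name__ == "__main__":
    for xs in ([1, 2, 3, 4, 5], [2, -1, 7], [Fraction(1, 2), 3, -4, 5, 6, 7]):
        print(xs, power_sum(xs, 5) == p5_from_elementary(xs))
```

Exact integer and `Fraction` arithmetic is used so that agreement is an equality rather than a floating-point approximation; a check on sample points is of course not a proof of the identity.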


EXERCISES 5.1 IRREDUCIBLES AND UNIQUE FACTORIZATION 


7. +1 
10. (a) Irreducible (c) Not irreducible 
12. (a) Irreducible (c) Not irreducible 
14. (a) Not irreducible (c) Irreducible 


16. (a) If p ~ q, suppose p is irreducible. If q = ab in R then p ~ ab so p ~ a or p ~ b.
Thus q ~ a or q ~ b, so q is irreducible. The converse is the same.

23. No. Z(./—5). 

27. Write d = gcd[a, gcd(b, c)] and d₁ = gcd[gcd(a, b), c]. Then d divides a and gcd(b, c),
so it divides all of a, b, and c. Thus d divides gcd(a, b) and c, which gives d|d₁. Similarly
d₁|d, so d ~ d₁. Moreover, this result shows that d divides a, b, and c and that every
common divisor of a, b, and c divides d. Hence, gcd(a, b, c) exists and d ~ gcd(a, b, c).

31. If m ~ lcm(a₁, ..., aₙ) exists in R then aᵢ|m for each i shows (m) ⊆ (aᵢ) for
each i, and hence that (m) ⊆ A where we write A = (a₁) ∩ ··· ∩ (aₙ). But r ∈ A
means aᵢ|r for each i, so m|r by definition. Thus, r ∈ (m) and we have (m) = A.
Conversely, if A = (m) then aᵢ|m for each i (because m ∈ (aᵢ)); and, if aᵢ|r for
each i, then r ∈ A = (m) so m|r. Thus, m is a least common multiple of the aᵢ.

35. Use Gauss’ lemma. 

37. If f= ug, u is a unit, write u = ¢. Since f and g are primitive, show a ~ b. 

39. (a) Show that R is a subring of Z[z]. 

40. (a) Show (x) C ($2) C ($a) C.... 




EXERCISES 5.2 PRINCIPAL IDEAL DOMAINS 


1. No, Z[x] in Q[x].

5. Let A = (a), a ≠ 0. If a is a unit then |R/A| = 1. Otherwise, by Theorem 4 §3.3,
let B/A be any ideal of R/A, say B = (b). Show that there are at most finitely
many such divisors b of a up to associates.

. (a) Write R = Z_(p). R is a subring of Q because m/n − m′/n′ = (mn′ − m′n)/(nn′) and
(m/n)(m′/n′) = mm′/(nn′), and p does not divide nn′. Thus, R is an integral domain. Given
m/n in R, if p does not divide m then m/n is a unit in R (with inverse n/m). Conversely,
if m/n is a unit, say (m/n)(m′/n′) = 1, then mm′ = nn′ so p does not divide m (it does not
divide n or n′), so R* = {m/n ∈ R | p does not divide m}.

13. (b) a = (1 + √−2)b + (−1 + √−2), where δ(−1 + √−2) = 3 < 11 = δ(b).

15. (b) a = 5b + (−1), where δ(−1) = 1 < 2 = δ(b)

24. (a) Z(i)/A ≅ Z₂

26. (a) (1) ⇒ (2). Given a ≠ 0, b ≠ 0, let (a, b) = (d). Then d = ra + sb for some r,
s ∈ R, so if k|a and k|b in R then k|d. But d|a and d|b because a, b ∈ (d).

(2) ⇒ (1). Given A = (a, b), clearly A is principal if a = 0 or b = 0. Otherwise
let g = gcd(a, b) ~ d where d = ra + sb for r, s ∈ R. Then d ∈ A so (d) ⊆ A. On
the other hand, d|a and d|b so a ∈ (d) and b ∈ (d). Hence (a, b) ⊆ (d).

35. No. If so, and w = √−2, then −2 = w² > 0. But 2 = 1 + 1 > 0.


EXERCISES 6.1 VECTOR SPACES 


31. 


. (a) No (c) No 
. (a) Yes (c) No 
. (a) Dependent (c) Independent 


. {(1, −1, 0), (1, 1, 1), (a, 0, 0)} for any a ≠ 0 in R

. I, A, A?, A?, A* cannot be independent. 

. (a) {1, r, ..., rⁿ} is not independent because dim_F(R) = n. So a₀ + a₁r + ··· +
aₙrⁿ = 0 where the aᵢ ∈ F are not all zero.


. (b) If {u₁, ..., u_m} ⊆ {u₁, ..., u_m, ..., u_n} then Σ_{i=1}^{m} aᵢuᵢ = 0 implies that
Σ_{i=1}^{n} aᵢuᵢ = 0 where a_{m+1} = ··· = a_n = 0. So aᵢ = 0 for 1 ≤ i ≤ m.
. If vᵢ is in span{v₁, ..., v_{i−1}, v_{i+1}, ..., vₙ}, then vᵢ = Σ_{j≠i} aⱼvⱼ so, writing aᵢ = −1,
Σⱼ aⱼvⱼ = 0, with aᵢ ≠ 0. Hence {v₁, ..., vₙ} is dependent. Conversely, if
Σⱼ aⱼvⱼ = 0, where some aᵢ ≠ 0, then vᵢ = Σ_{j≠i} (−aᵢ⁻¹aⱼ)vⱼ is in
span{v₁, ..., v_{i−1}, v_{i+1}, ..., vₙ}.

(a) They are additive subgroups by group theory. If v ∈ ker φ then
φ(av) = aφ(v) = a0 = 0 for all a ∈ F. If w ∈ im φ, say w = φ(v), then
aw = aφ(v) = φ(av) ∈ im φ.


EXERCISES 6.2 ALGEBRAIC EXTENSIONS 


AEN 


. (a) u4—-16u?+4=0 (c) u® + 2u* + 49 =0 
. (a) ct ~102? +1 (c) at — 2a? —2 
(a) Algebraic (c) Transcendental 


(a) 2? — 22 +2 


21. 


23, 


26. 
31. 


33. 




. (a) (u − √3)² = (−i)² = −1 so u² − 2√3u + 4 = 0. We claim that the minimal
polynomial is m(x) = x² − 2√3x + 4 ∈ R[x]. Its roots in C are √3 ± i and neither is in
R, so it is irreducible in R[x], as required.


. (a) {1,u, u?}, where u = 4/2 (c) {1, u, u?, V3, V8u, V3u?}, where u = 1/3 


(e) {1, √3, √5, √15}


. (a) 2 (c) 2 
. If F(u) ⊇ L ⊇ F write p = [F(u) : F] = [F(u) : L][L : F]. Thus [L : F] = 1 or
p; so L = F or [L : F] = [F(u) : F], whence L = F(u) by Theorem 8 §6.1.


. Let u ∈ E − Q. Show that f ∈ Q[x] with degree 2 exists such that f(u) = 0,
say f = ax² + bx + c. Conclude that [Q(u) : Q] = 2, so E = Q(u). We may assume
a, b, and c are integers, so E = Q(√d). If d = p²e, e ∈ Z, p a prime, then
E = Q(p√e) = Q(√e). Continue until E = Q(√m) where m is square free.

(a) Write L = F(u), so F(u, v) = L(v). Thus L(v) ⊇ L ⊇ F and [L : F] = m
by hypothesis, and [F(u, v) : F] = [L(v) : L]·m. Hence, we simply show that
[L(v) : L] ≤ n. If p and m are the minimal polynomials of v over L and F, respectively,
then p|m by Theorem 3, so [L(v) : L] = deg p ≤ deg m = n.

If /2€ Q(m) then VI= Fe, fig €Q|z], gcd(f,g)=1. Then h(x) =0, where 
h(a) = 2g9?(x) — f?(«), and this is a contradiction if h#0 in Q[a]. But h=0 
means 29’= f” so, since gcd(f,g)=1, f|2. Thus f=+1, +2, g?(x)=+1, +4. 
This forces deg g=0; g€Q, g=+1, £75. Thus g= +1, Va=t = +1, a 
contradiction. Thus h # 0 and /2€ Q(z) has led to a contradiction. So V2¢ Q(n). 
(a) Let f(u”) = 0, 0# f € F[z]. Use g where g(x) = f(a”) #0. 

(a) We show F(u) = Q, where Q = {f(u)g(u)⁻¹ | f, g ∈ F[x]; g(u) ≠ 0}. Since u
is transcendental over F, f(u) ≠ 0 whenever f(x) ≠ 0. Thus Q is a subfield
of E containing F and u, so F(u) ⊆ Q. But any subfield of E containing F and
u must contain Q, so F(u) ⊇ Q. Thus F(u) = Q.

Show that if u is algebraic over A then u ∈ A, contrary to hypothesis. If
f(u) = 0 where f ≠ 0 in A[x], let f = w₀ + w₁x + ··· + wₙxⁿ, wᵢ ∈ A. Show
that u ∈ L(u) where L = F(w₀, ..., wₙ) is a finite extension of F.


EXERCISES 6.3 SPLITTING FIELDS 


13. 


. (a) E= Qiv3), and [E:Q]=2. = (c) E= Q(i,V7), and [E: Q) =4. 
- (a) Q(V3, V5) 
. (a) F=Z,(u), Ww +u+1=0; f(z) = (2@+ D(e@tul(e+1t+u) 


(c) E=Za(u), f(u) = 0; f(z) = (w@t+u)(e+u?)(e+1+u+v’) 
(e) E = Za(u), u2+1=0, f(x) = (e@—u)2(e@+u)? 


. (a) No. If C were the splitting field of f(x) ∈ Q[x] then C = Q(u₁, ..., uₙ). Thus
C ⊇ Q would be algebraic, contradicting the fact that π or e is transcendental.


. If gcd(f, g) = 1 let 1 = fh + gk; h, k in F[x]. Suppose E ⊇ F is an extension
containing a common root u, that is f(u) = 0 = g(u). Then substitution gives
1 = f(u)h(u) + g(u)k(u) = 0, a contradiction. So no such extension E exists.
Conversely, let d = gcd(f, g). If d ≠ 1 then deg d ≥ 1 so let E ⊇ F be a field
containing a root u of d. Then d|f and d|g means f(u) = 0 = g(u), contrary to
hypothesis. So d = 1.

Show that the roots of xᵖ − 1 are 1, w, w², ..., wᵖ⁻¹, so Q(w) is the splitting field.
By Theorem 6, Appendix A, xᵖ − 1 = (x − 1)Φ_p, where Φ_p = xᵖ⁻¹ + xᵖ⁻² + ··· + x + 1
is irreducible over Q by Example 13 §4.2.




20. (a) We show A= Q(i). Clearly A DQ is algebraic. We must show that if we # 
is algebraic over Q then ue A. Since u is algebraic over A, we show that u¢ A 
implies u is transcendental over A. We have EF = A(m) so this follows from Exercise 
31 §6.2 if we can show that 7 is transcendental over A. But if 7 were algebraic over 
A it would be algebraic over A D A, contrary to the preceding exercise. 


EXERCISES 6.4 FINITE FIELDS 


1. (a) 2 (c) Any element of GF(8) except 0 and 1. 

4. (a) GF(p") (C) Gre) 
iS | 

GF(p*) GF(p*) GF(p*) 
ee «i | 

GF(p*) GF(p’) GF(p*) 
SX | 

GF(p) GF(p) 


5. If GF(16) = {a + bt + ct² + dt³ | a, b, c, d in Z₂, t⁴ = t + 1}, then t is primitive.
The subfields are GF(2⁴) = GF(16), GF(2) = Z₂, and GF(2²) = {0} ∪ ⟨t⁵⟩ =
{0, 1, t⁵, t¹⁰} = {0, 1, t + t², 1 + t + t²}.
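
The claims in answer 5 about GF(16) can be confirmed by listing the powers of t. The sketch below assumes the encoding of a + bt + ct² + dt³ as a 4-bit integer (bit i is the coefficient of tⁱ); the helper names are illustrative.

```python
MOD = 0b10011  # t^4 + t + 1, i.e. the relation t^4 = t + 1 over Z2

def times_t(a):
    """Multiply an element of GF(16) by t and reduce with t^4 = t + 1."""
    a <<= 1
    if a & 0b10000:
        a ^= MOD
    return a

def powers_of_t():
    """Return [t^0, t^1, ..., t^14] as 4-bit integers."""
    out, a = [], 1
    for _ in range(15):
        out.append(a)
        a = times_t(a)
    return out

if __name__ == "__main__":
    powers = powers_of_t()
    # t is primitive exactly when its 15 powers are the 15 nonzero elements.
    print("t is primitive:", len(set(powers)) == 15)
    # t^5 and t^10 should be t + t^2 (0b0110) and 1 + t + t^2 (0b0111).
    print("t^5 =", bin(powers[5]), " t^10 =", bin(powers[10]))
```

Together with 0 and 1, the elements t⁵ and t¹⁰ make up the four-element subfield listed in the answer.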


9. If G ⊆ C*, |G| = n, then G = ⟨u⟩, where u = e^{2πi/n}.

17. Let d = gcd(f, f′) and write d = fg + f′h where g and h are in F[x]. If d = 1,
suppose f has a repeated root a in E ⊇ F. Then x − a divides both f and f′
in E[x], and so divides d = 1, a contradiction. Conversely, if d ≠ 1, let E be a
splitting field of f over F. Then d|f implies d has a root a in E, so x − a divides
f and f′, a contradiction by Theorem 3.

22. (a) If K ⊇ Z_p is a field containing a root u of f, conclude that f is the minimal
polynomial of u over Z_p. If E = Z_p(u) then [E : Z_p] = n and so |E| = pⁿ. Then
u is a root of h = x^{pⁿ} − x, so f|h in E[x], say h = qf. But h = q₀f + r in Z_p[x]
by the division algorithm, so this holds in E[x]. By the uniqueness in E[x], we get
q = q₀ ∈ Z_p[x] and r = 0 ∈ Z_p[x].


EXERCISES 6.5 GEOMETRIC CONSTRUCTIONS 


3. Yes. Bisect 30°, after constructing that from a 30-60-90-triangle. 


5. No. A sphere of radius 1 has volume (4/3)π and, if this is the volume of a cube with
side a, then a = ((4/3)π)^{1/3}. But a is not constructible since it is not even algebraic
over Q. For if a is a root of f(x) € Q[z]. Then = is a root of g(x) = f[$a%]. This 


is impossible as 7 is transcendental over Q. 


EXERCISES 6.7 AN APPLICATION TO CYCLIC AND BCH CODES 


5. (a) In Bysl+t, t+?=t(1142t), ?4+¢8=07(1+t) and 1+¢=13(1+1t). The 
other members 0, 1+#?, ¢+¢? and 1+t+#7+° all lie in smaller ideals. (See 
Example 3.) 


7. 


11. 


17. 




(a) 1 + x⁷ = (1 + x)(1 + x + x³)(1 + x² + x³) so there are 2·2·2 = 8 divisors in
all. Thus, there are 7 codes (excluding (1 + x⁷) = 0).

Since g(x) = 1 + x + x⁴ has no root in Z₂, if it factorizes at all, it must do so as
g(x) = (a + bx + cx²)(a′ + b′x + c′x²). Thus aa′ = 1 = cc′ so a = a′ = 1 = c = c′.
Thus g(x) = (1 + bx + x²)(1 + b′x + x²) so (coefficient of x³) b + b′ = 0 and
(coefficient of x) b + b′ = 1. This is impossible.

(c) Write g(x) = 1 + x + x³. We have 1 + x⁷ = (1 + x)(1 + x² + x³)g(x), a
product of irreducibles. Hence, 1 + x⁷ = h(x)g(x), where h(x) = 1 + x + x² + x⁴.
We have 1 = xg(x) + 1·h(x), so take e(x) = xg(x) = x + x² + x⁴. Note that
e(t)² = e(t²) = t² + t⁴ + t⁸ = t² + t⁴ + t = e(t). So e(t) = t + t² + t⁴ is the
idempotent generator.
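
The arithmetic in part (c) takes place in Z₂[x], so it can be checked with bit operations. The sketch below assumes polynomials over Z₂ are encoded as integers (bit i is the coefficient of xⁱ); `pmul` and `pmod` are illustrative helper names.

```python
def pmul(a, b):
    """Carry-less product of two polynomials over Z2."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def pmod(a, m):
    """Remainder of a on division by m over Z2 (m must be nonzero)."""
    dm = m.bit_length()
    while a.bit_length() >= dm:
        a ^= m << (a.bit_length() - dm)
    return a

G = 0b1011          # g(x) = 1 + x + x^3
H = pmul(0b11, 0b1101)  # h(x) = (1 + x)(1 + x^2 + x^3)
X7_PLUS_1 = 0b10000001  # 1 + x^7
E = pmul(0b10, G)   # e(x) = x g(x)

if __name__ == "__main__":
    print("h = 1 + x + x^2 + x^4:", H == 0b10111)
    print("h g = 1 + x^7:", pmul(H, G) == X7_PLUS_1)
    print("x g + h = 1:", E ^ H == 1)
    print("e^2 = e mod 1 + x^7:", pmod(pmul(E, E), X7_PLUS_1) == E)
```

The last line is the idempotent check e(t)² = e(t), carried out modulo 1 + x⁷.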


EXERCISES 7.1 MODULES 


1. 


21. 


(c) Using (a), x + (−x) = 0 = 0x = (1 + (−1))x = 1x + (−1)x = x + (−1)x, so
−x = (−1)x.


. (a) If α: M → N is onto and R-linear, and if M = Rx₁ + ··· + Rxₙ, then we
have N = Rα(x₁) + ··· + Rα(xₙ). Since some of the α(xᵢ) may be zero, the result
follows.


. (a) If x = Σaᵢkᵢ then rx = Σ(raᵢ)kᵢ ∈ AK
. Let A = ΣᵢRaᵢ, aᵢ ∈ A, and M = ΣⱼRxⱼ, xⱼ ∈ M. Use Exercise 5 to show that
AM = Σ_{i,j} Raᵢxⱼ.


. (a) Define α: N → (K + N)/K by α(n) = n + K for all n ∈ N. Show that α is R-linear
and ker α = N ∩ K. Every coset in (K + N)/K has the form (k + n) + K, where k ∈ K
and n ∈ N. But (k + n) + K = n + K = α(n), which proves that α is onto. Now
the isomorphism theorem applies.


. (a) Yes. (m, n) = (n, n) + (m − n, 0) shows that M = K + X; clearly K ∩ X = 0.
16. 


(a) We have M = π(M) + ker π because m = π(m) + (m − π(m)) for each m ∈ M
and π[m − π(m)] = π(m) − π²(m) = 0. If m ∈ π(M) ∩ ker π, let m = π(m₁)
with m₁ ∈ M. Then 0 = π(m) = π²(m₁) = π(m₁) = m, so π(M) ∩ ker π = 0.

(a) If w + AW = w₁ + AW we must show that α(w) + AV = α(w₁) + AV. Show
that w − w₁ = Σᵢaᵢwᵢ where aᵢ ∈ A, wᵢ ∈ W, and apply the linearity of α.


EXERCISES 7.2 MODULES OVER A PID 


. (c) Za®Z3, Z2PZ202Z3. 
. (a) The types are (4), (3,1), (2,2), (2,1,1) and (1,1,1,1). Hence, representative
groups are Z_{p⁴}, Z_{p³} ⊕ Z_p, Z_{p²} ⊕ Z_{p²}, Z_{p²} ⊕ Z_p ⊕ Z_p and Z_p ⊕ Z_p ⊕ Z_p ⊕ Z_p.


. (a) Zp@Z,2, ZpPLZy@Zq. 
. (a) The types are: p-component (2), (1,1); the q-component (3), (2,1), (1,1,1);
and the r-component (4), (3,1), (2,2), (2,1,1), (1,1,1,1). Hence 2·3·5 = 30 in
all.
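
The count 2·3·5 = 30 is a product of partition counts, one factor per prime. As a minimal sketch (assuming the group order is p²q³r⁴, which is what the listed types suggest; the function names are illustrative), the types can be enumerated directly:

```python
def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples, e.g. (2, 1, 1)."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def abelian_group_count(exponents):
    """Number of abelian groups of order p1^e1 * ... * pk^ek, up to isomorphism."""
    total = 1
    for e in exponents:
        total *= sum(1 for _ in partitions(e))
    return total

if __name__ == "__main__":
    print(list(partitions(4)))             # the five types of the r-component
    print(abelian_group_count([2, 3, 4]))  # product of the three partition counts
```

Each partition of the exponent is one type of the corresponding primary component, so the product counts all combinations.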


. (a) Thus G(2) has type (2,2); G(8) has type (1,1,1) and G(5) has type (2, 1). 
. (a) The types are (2,2, 2), (2,2,1,1), (2,1,1,1,1), (1,1,1,1,1,1). 
12. 


(a) T(K) = {k ∈ K | o(k) ≠ 0} = K ∩ {m ∈ M | o(m) ≠ 0} = K ∩ T(M).


16.

20.
22.

27.

(a) Define σ: K → M/T(M) by σ(k) = k + T(M). This is a group homomorphism
and ker σ = {k ∈ K | k ∈ T(M)} = K ∩ T(M) = T(K). Use the isomorphism
theorem.

(c) If m = Σᵢxᵢ then dm = 0 implies dxᵢ = 0 for all i. Hence L_d(M) ⊆ ⊕ᵢ L_d(Mᵢ).

(a) Here L_p(Rx) = {rx | p(rx) = 0}. If prx = 0 then pr ∈ ann(x) = (pᵐ) = Rpᵐ,
say pr = spᵐ. Since m ≥ 1 and R is a domain, this gives r = spᵐ⁻¹, and we have
shown that L_p(Rx) ⊆ R(pᵐ⁻¹x). The other inclusion is because pᵐx = 0. Finally,
p(Rx) = Rpx is a routine verification, and px = 0 if m = 1 because o(x) = p.

(a) L_p(G) consists of 0 and the elements of order p. We have L_p(G) = L_p(G₁) ⊕ ··· ⊕
L_p(Gₙ) by Exercise 20, and |L_p(Gᵢ)| = p for each i by Exercise 26.


EXERCISES 8.1 FACTORS AND PRODUCTS 


15. 


22, 


. (a) Clearly 4 c82n%. If Kge=n® 


. (a) XY = {e,0,07}. Note: XY is a subgroup here, but X and Y are not. 
; £ <J B as & is abelian. Hence H < G by the correspondence theorem. 


- (a 


a) K, G, (a), {1, a3, b, ba}, {1, a5, ba, ba*}, {1, a3, ba?, ba®}. 
(c) K, Ad 


. (a) pZ, where p is any prime 


(c) If Dio = {1,a,...,a°,b, ba,...,ba2}, where |a]=10, |b] =2, and aba=b, the 
maximal subgroups are 
Hy, =(a ) 
An= 
ez = 


(a?, b) ebay aaa ie shea bas bapa 

(a°,b) = {1,a°, b, ba®} 

= (a® ia) = {1,a°, ba, ba®} 

= (a®, ba”) = {1, a5, ba”, ba} 

= (a®, ba?) = {1, a5, ba®, ba®} 

ae (a nite a®, bat am 

let Kg=Kh, hE H. Then 
HOR | 


aan 
gh"e€K CH, Se. Similarly g € H,, so Kg ¢#0e 


. (a) H² ⊆ H because H is closed; H ⊆ H² because 1 ∈ H.


KA = AK and KB = BK are subgroups by Theorem 5 §2.8. Given kb in KB,
Ab = bA and Kb = bK by hypothesis, so KA(kb) = AKkb = AKb = AbK = bAK =
bKA = KbA = kKbA = (kb)KA. Thus KA ◁ KB.

(a) Let a ≠ b have order 2, show that H = {1, a, b, ab} is closed and apply
Lagrange’s theorem.


EXERCISES 8.2 CAUCHY'S THEOREM 


1, 


7. 


(a) {1}, {a, a³}, {a²}, {b, ba²}, {ba, ba³}. The normal subgroups are the unions
{1}, {1, a²} and {1, a, a², a³}, {1, b, a², ba²} and {1, ba, a², ba³}.

Let K = g⁻¹Hg, so H = gKg⁻¹. We claim N(K) = g⁻¹N(H)g. Let a ∈ N(K)
so a⁻¹Ka = K. To show a ∈ g⁻¹N(H)g it suffices to show gag⁻¹ ∈ N(H). But
(gag⁻¹)⁻¹H(gag⁻¹) = ga⁻¹g⁻¹Hgag⁻¹ = ga⁻¹Kag⁻¹ = gKg⁻¹ = H. Hence N(K) ⊆
g⁻¹N(H)g. A similar argument shows that N(H) ⊆ gN(K)g⁻¹ so g⁻¹N(H)g ⊆ N(K).


11. 


14, 
16. 
25. 
29. 




Because H C N(H) CG, Exercise 31 §2.6 shows that |G: N(H)| is finite. Hence, 
Theorem 2 applies. 

We have a! Ha = {1,ba”} so ba?¢ N(H). Continue in this way. 

N(y) = (9). 

H is a union of conjugacy classes. Show that there exists a # 1 such that {a} C H. 
Since CG, let Z[G/C] = K/C. Since |G/C| >1, Theorem 6 shows CCK. 
But K<G so K ZH by Exercise 23 §2.8. If ke K then kC is in the center 
of G/C, so h'khké€C dH. Hence k!Hk CH, and similarly kHk'C H. 
Thus k € N(H) and we have shown K C N(H). 


EXERCISES 8.3 GROUP ACTIONS 


1. 


10. 


15. 


17. 


23. 


28, 


32. 


(a) By Cauchy’s Theorem let a ∈ G, |a| = 5. If H = ⟨a⟩, then |G : H| = 4, so there
is a homomorphism θ: G → S₄ with ker θ ⊆ H. Then ker θ ≠ {1} because |G| = 20
does not divide |S₄| = 24. Because H is simple, ker θ = H, so H ◁ G.
(a) H₀ ⊆ H because H = 1_G(H). If τ ∈ aut G then τ⁻¹σ ∈ aut G for all σ ∈
aut G, so H₀ ⊆ τ⁻¹σ(H). Thus τ(H₀) ⊆ σ(H) for all σ, so τ(H₀) ⊆ H₀. Similarly
τ⁻¹(H₀) ⊆ H₀, whence τ(H₀) = H₀. Thus H₀ is characteristic in G.
If σ = (k₁ k₂ ···)(m₁ m₂ ···)(n₁ n₂ ···)···, the orbits of the group G in
Xₙ are G·k₁ = {k₁, k₂, ...}, G·m₁ = {m₁, m₂, ...}, G·n₁ = {n₁, n₂, ...}, ....
Clearly, G·k = {k} if and only if k is fixed by σ.
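
The correspondence between orbits and cycles can be seen on a small example. The sketch below assumes G is the cyclic group generated by a single permutation σ, given as a dictionary point → image; the sample permutation (1 3 5)(2 4) on {1, ..., 6} is a hypothetical illustration, not one taken from the exercise.

```python
def cycle_orbits(perm):
    """Orbits of the cyclic group generated by perm (a dict point -> image)."""
    seen, orbits = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        orbit, k = [], start
        while k not in seen:      # follow k, perm(k), perm(perm(k)), ...
            seen.add(k)
            orbit.append(k)
            k = perm[k]
        orbits.append(orbit)
    return orbits

if __name__ == "__main__":
    # sigma = (1 3 5)(2 4), with 6 fixed
    sigma = {1: 3, 3: 5, 5: 1, 2: 4, 4: 2, 6: 6}
    print(cycle_orbits(sigma))
```

The singleton orbit is exactly the point fixed by σ, matching the last sentence of the answer.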
(a) x ≡ x because x = x·1; if x ≡ y, say y = x·a, a ∈ G, then x = y·a⁻¹, so y ≡ x;
if x ≡ y and y ≡ z, say y = x·a, z = y·b, then z = (x·a)·b = x·(ab), so x ≡ z.
(a) If a, b ∈ S(x) then (ab)·x = a·(b·x) = a·x = x, so ab ∈ S(x). Similarly,
a⁻¹·x = a⁻¹·(a·x) = (a⁻¹a)·x = 1·x = x, so a⁻¹ ∈ S(x). Finally 1·x = x
shows 1 ∈ S(x), and we are done.
If X ={H CG|H is a subgroup and |H|=p*}, let G act on X by conjugation. 
Use Theorem 4. 
(a) (1, 1)·x = 1x1⁻¹ = x, and

(h₁, k₁)·((h, k)·x) = h₁(hxk⁻¹)k₁⁻¹ = (h₁h, k₁k)·x = [(h₁, k₁)(h, k)]·x.
The orbit is
(H × K)·x = {(h, k)·x | h ∈ H, k ∈ K} = {hxk⁻¹ | h ∈ H, k ∈ K} = HxK.


EXERCISES 8.4 THE SYLOW THEOREMS 


1. 


If P is a Sylow 3-subgroup, then P = ⟨γ⟩, where γ is a 3-cycle, say γ = (i j k).
If σ is the permutation with σ(1) = i, σ(2) = j, σ(3) = k, σ(4) = l, where
{1, 2, 3, 4} = {i, j, k, l}, then σ(1 2 3)σ⁻¹ = γ.
Hence σ⟨(1 2 3)⟩σ⁻¹ = P, so P is conjugate to ⟨(1 2 3)⟩.

. P is a Sylow p-subgroup of N(P), being a p-subgroup of maximal order. It is unique
because it is normal in N(P).


. (a) |G| = 40 = 2³·5. Thus, n₅ = 1, 2, 4, 8, and n₅ ≡ 1 (mod 5). Hence, n₅ = 1, so the
Sylow 5-subgroup is normal.
(c) |G| = 48 = 2⁴·3. If P is a Sylow 2-subgroup then |G : P| = 3, so a homomorphism
θ: G → S₃ exists. Clearly, ker θ ≠ {1}.
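
The counting step in these answers (nₚ divides the index of a Sylow p-subgroup and nₚ ≡ 1 mod p) is easy to automate. This is a minimal sketch with an illustrative function name; it lists the values the Sylow theorems allow, not the values that actually occur in a given group.

```python
def sylow_counts(order, p):
    """Values of n_p allowed by Sylow's theorems for a group of the given order."""
    m = order
    while m % p == 0:       # strip the p-part; m is the index of a Sylow p-subgroup
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]

if __name__ == "__main__":
    print(sylow_counts(40, 5))   # only one value is possible, so the subgroup is normal
    print(sylow_counts(48, 2))   # more than one candidate, so a different argument is needed
```

For order 48 the list has two entries, which is why part (c) uses the action on cosets instead of counting.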




9. 


11. 


13. 


19. 




(a) |G| = 70 = 2·5·7. Then n₅ = 1, 2, 7, 14, and n₅ ≡ 1 (mod 5), so n₅ = 1. Similarly
n₇ = 1, so let P ◁ G and Q ◁ G, where |P| = 5 and |Q| = 7. Because P ∩ Q = {1},
PQ ≅ P × Q ≅ C₅ × C₇ ≅ C₃₅. Hence, |G : PQ| = 2, so PQ ◁ G.

(a) |G| = 105 = 3·5·7. Then n₇ = 1, 3, 5, 15, and n₇ ≡ 1 (mod 7), so n₇ = 1,
15. Similarly, n₅ = 1, 21. Let P and Q be Sylow 7- and 5-subgroups. If neither is
normal in G, then G has 21·4 = 84 elements of order 5 and 15·6 = 90 elements
of order 7, a contradiction. So P ◁ G or Q ◁ G; hence PQ is a subgroup, and
|PQ| = |P||Q| = 35 because P ∩ Q = {1}. As |G : PQ| = 3, let θ: G → S₃ be a
homomorphism with ker θ ⊆ PQ. Then |ker θ| ≠ 1, 5, 7, so PQ = ker θ ◁ G. Finally,
P ◁ PQ and Q ◁ PQ by the Sylow Theorems, so PQ ≅ P × Q ≅ C₇ × C₅ ≅ C₃₅.

Let P be a Sylow p-subgroup of G. Since p > m we have |P| = pⁿ, so |G : P| = m.
Apply Theorem 1 §8.3.

If Q is also a Sylow p-subgroup of G, then Q = a⁻¹Pa by Sylow’s second theorem.
If g ∈ N(Q) then Q = g⁻¹Qg; that is a⁻¹Pa = g⁻¹a⁻¹Pag. This implies that
aga⁻¹ ∈ N(P) = P, whence g ∈ a⁻¹Pa = Q.


EXERCISES 8.5 SEMIDIRECT PRODUCTS 


1. 


(a) Write σ = (1 2) ∈ Sₙ and H = ⟨σ⟩. Then Aₙ ⊆ AₙH ⊆ Sₙ. As Sₙ/Aₙ ≅ C₂,
either AₙH = Sₙ or AₙH = Aₙ. Since σ ∉ Aₙ we have Sₙ = AₙH. Similarly,
Aₙ ∩ H ≠ {ε} means Aₙ ∩ H = H (because H is simple), contradicting σ ∉ Aₙ once
more. Hence Aₙ ∩ H = {ε} and the result follows from Theorem 2.


. This is an instance of Theorem 3 (3), where p = 3 and q = 13. We have q ≡ 1
(mod p), so we look for m such that 1 ≤ m ≤ 12 and m³ ≡ 1 (mod 13). If m = 1
then G ≅ C₁₃ × C₃ ≅ C₃₉. The first solution with m > 1 is m = 3, whence G = ⟨a, b⟩
where o(a) = 13, o(b) = 3 and ab = ba³.
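
The search for m is a short computation. The following sketch (with an illustrative function name) lists every m in the range 1 ≤ m ≤ 12 whose cube is congruent to 1 modulo 13.

```python
def cube_roots_of_unity(q):
    """All m with 1 <= m <= q - 1 and m^3 congruent to 1 modulo q."""
    return [m for m in range(1, q) if pow(m, 3, q) == 1]

if __name__ == "__main__":
    print(cube_roots_of_unity(13))
```

The smallest value larger than 1 in the output is the exponent used in the relation between a and b.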


_ EXERCISES 8.6 AN APPLICATION TO COMBINATORICS 


6. 
8. 


(a) 97(@? +11) 
(a) $9(q +1)(q* — g? +g? + 2) 


EXERCISES 9.1 THE JORDAN-HOLDER THEOREM 


1. 
3. 


8. 


11. 


(a) 3; C2, Co, Ce (c) 3; C2, C2, C2 (e) 3; C2, C2, Cy 
(a) If H_k is the unique subgroup of order k in C₂₄, the series are

C₂₄ ⊃ H₁₂ ⊃ H₄ ⊃ H₂ ⊃ {1}
C₂₄ ⊃ H₁₂ ⊃ H₆ ⊃ H₂ ⊃ {1}
C₂₄ ⊃ H₁₂ ⊃ H₆ ⊃ H₃ ⊃ {1}
C₂₄ ⊃ H₈ ⊃ H₄ ⊃ H₂ ⊃ {1}
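
Because a cyclic group has exactly one subgroup of each order dividing n, its composition series correspond to chains of divisors in which each step divides by a prime. A minimal sketch of that enumeration, with illustrative function names:

```python
def prime_divisors(n):
    """Distinct prime divisors of n, in increasing order."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def composition_series(n):
    """Chains of subgroup orders n > ... > 1 of C_n with prime index at each step."""
    if n == 1:
        return [[1]]
    series = []
    for p in prime_divisors(n):
        for tail in composition_series(n // p):
            series.append([n] + tail)
    return series

if __name__ == "__main__":
    for chain in composition_series(24):
        print(" > ".join(str(k) for k in chain))
```

For n = 24 this prints four chains, one for each position of the factor of index 3 among the three factors of index 2.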


(a) Let n = p₁p₂···p_m, where the pᵢ are distinct primes. Then Cₙ has length
1 + 1 + ··· + 1 = m by Example 8.

Induct on n. If n = 1 then G = G₀ ⊃ G₁ = {1} so G ≅ G₀/G₁ is finite. In general,
G₁ is finite by induction, and G/G₁ = G₀/G₁ is finite by hypothesis. Thus
G consists of |G/G₁| cosets, each with |G₁| elements. Hence G is finite. Now
|G| = |G₀/G₁|·|G₁|, and the formula follows by induction.


15. 




(a) If M ⊆ Cₙ is maximal normal, then Cₙ/M has order a prime q (being simple
and abelian). Hence, q = pᵢ for some i because q divides |Cₙ| = n. Thus, |M| = n/pᵢ
for some i = 1, 2, ..., r. Since Cₙ is cyclic, it has exactly one subgroup of order n/pᵢ by
Theorem 9 §2.4.


EXERCISES 9.2 SOLVABLE GROUPS 


15. 


19. 


21. 
23. 


27. 


. No, Z(S₄) = {ε}.
. No. S₄ is solvable (Example 4) but S₄′ = A₄ is not abelian. Indeed S₄′ ⊆ A₄ because
S₄/A₄ is abelian. Thus S₄′ = A₄, {ε} or K = {ε, (1 2)(3 4), (1 3)(2 4), (1 4)(2 3)}.
But S₄/{ε} and S₄/K are not abelian (see Exercise 30 §2.9).


. (a) This is because α[a, b] = [α(a), α(b)] for every commutator [a, b] from G.
. By Exercise 14 §8.4, let K ◁ G, K ≠ {1}, G. Then both |K| and |G/K| are in
{p, q, p², pq}. Hence, both are either abelian or of order pq and thus are solvable. Use
Theorem 4.
If G is solvable and G = G₀ ⊃ G₁ ⊃ ··· ⊃ Gₙ = {1} is a composition series, each
simple factor is abelian and hence finite. Hence, |G| = |G₀/G₁|·|G₁/G₂| ··· |Gₙ₋₁/Gₙ| is
finite (see Exercise 11 §9.1). The converse holds because every finite group has
a composition series.

HK is a subgroup as K ◁ G, and HK/K ≅ H/(H ∩ K) is solvable by Theorem 3 (H is
solvable). Done by Theorem 4.

(a) Because G ≠ {1}, G′ ≠ G by Theorem 5. Thus G/G′ is nontrivial and abelian.

(a) Write {K ◁ G | G/K solvable} = {K₁, K₂, ..., K_m}. This set is nonempty
as it contains G. Then R = ∩ᵢ Kᵢ is normal and G/R is solvable by Exercise 18.
If K ◁ G and G/K is solvable, then R ⊆ K by definition.

(a) Write V = WG). Then V dG because the intersection of normal subgroups 
is normal. Note that the intersection is not empty because G<G and G/G is in 
v. If V= Kinken:::-NK, then G/V embeds in Exo xe (as in Exercise 
18) and -* it xe is in V by induction because V is closed under taking direct 
products. Hence G/V is in V, being isomorphic to a subgroup of a group in V. 


EXERCISES 9.3 NILPOTENT GROUPS 


2. 


6. 


9. 


13. 


If H ◁ G and K ◁ G then a⁻¹[h, k]a = [a⁻¹ha, a⁻¹ka] ∈ [H, K] for all h ∈ H
and k ∈ K.

(a) By induction on n, it suffices to show that Γᵢ(G × H) ⊆ Γᵢ(G) × Γᵢ(H). Do so by
induction on i. If i = 0, then Γ₀(G × H) = G × H = Γ₀(G) × Γ₀(H). If the relation
holds for i ≥ 0, then Γᵢ₊₁(G × H) = [Γᵢ(G × H), G × H] ⊆ [Γᵢ(G) × Γᵢ(H), G × H],
so it suffices to show that, if A ⊆ G, B ⊆ H, then [A × B, G × H] ⊆ [A, G] × [B, H].
This outcome follows because [(a, b), (g, h)] = ([a, g], [b, h]).

If n= 2* then |D,,| = 2**? so D,, is nilpotent by Example 3. Conversely, suppose 
n=2*m, m>1 odd. Show that (a?" ,b) =D, so D, is nilpotent by Theorem 
1. But in this case {1,0} is a Sylow 2-subgroup that is not normal, contradicting 
Theorem 4. 

Kn Z(G) # {1} by Exercise 11. Thus, KN Z(G) =K by the condition on K, 
Now every subgroup of K is normal in G, so |K| is prime. 




18. 


21. 


24. 


26. 




(a) A is itself nilpotent by Theorem 1 so, by Theorem 4, H is a product of p-groups. 
Now apply Theorem 6 §8.2. 

(a) If H = (a?) then H and H are maximal. 

(c) If H = (a?) and H = (a?) then H and K are maximal, as is (a”). 

(1) ⇒ (2). We show that G′ ⊆ Φ, that is G′ ⊆ M for every maximal subgroup M of
G. Show that this follows by (1) because M ◁ G.

(a) Write ®(G) = ® and ® (€) =£, where F< G. If M is maximal in G then 
K CM (because K C®CM) and ¥ is maximal in 2. Hence £ C # whence 
FCM. It follows that F C ®. Conversely, lei a:G-—»G/K be the coset map. 
Then a(®) C ®(G/K) by the preceding exercise, so s€® implies cK €£; so 
céF. Ths Ocr 


EXERCISES 10.1 GALOIS GROUPS AND SEPARABILITY 


21. 
22. 
23. 
25. 
27. 


29. 


30. 


32. 


ε, σ⁻¹ and στ all fix F when σ and τ do.


. σ(Σaᵢvᵢ) = Σaᵢσ(vᵢ) because σ(aᵢ) = aᵢ for aᵢ ∈ F.
. If σ: E → E is an automorphism, show that σ(q) = q for all q ∈ Q.


C2 


. Co x Co 
. Show that E = F(u) ifue EW F. If m is the minimal polynomial of u over F, show 


deg m= 2. 


. Construct σ and τ in gal(E : Q) with σ(u) = iu, σ(i) = i, and τ(u) = u, τ(i) = −i.

. (a) v = √3 and w = √5 in Theorem 6.

. v = √p and w = √q in Theorem 6.

. See the Hint. 

. We proceed by induction on n. If n=1 then E = F(u,) = {f(u,) | f(z) € Fle]}. 


Hence o(f(u,)) = g(o(u,)) = g(r(u,)) =7(f(u,)) for all f, as required. In gen- 
eral, write K = F(u,,U9,...,U,-1) so that E=K(u,). By induction, o=7 
on K, so o,7 € gal(K: F). Since o(u,,) =7(u,,) the result follows from the case 
n=l. 

See the Hint. 

(a) For (3)=-(1), if f has a repeated root uin BD F, let 1 = fg+ f’h in Fla] by (3). 
If d= ged(f, f’), show d= 1. 

If BD F and gq is an irreducible factor of f in E[z], write f = pipe:+-p, in F(a], p: 
irreducible, and show that q|p; for some i. 

If not, and u is a root of f in a splitting field BH D F, show f = (x — u)? in E[a}. If ¢ 
is an irreducible factor of f in F[z], show g = (2 — u)’. 

(a) If F is perfect, and a ∈ F, let E be the splitting field of f = xᵖ − a. If u ∈ E is
a root of f show f = (x − u)ᵖ. If q is an irreducible factor of f in F[x] show that
q = x − u. Use Theorem 4.

(a) Let q be the minimal polynomial of u over F. If K = F(uᵖ) let m ∈ K[x]
be the minimal polynomial of u over K. Then q ∈ K[x] and q(u) = 0, so m|q.
But q has distinct roots by hypothesis, so m has distinct roots. On the other hand,
xᵖ − uᵖ ∈ K[x] and xᵖ − uᵖ = (x − u)ᵖ in E[x]. Hence m|(x − u)ᵖ so m = (x − u)ʳ.
Since m has distinct roots, r = 1 and so u ∈ K.

(a) Let p and q be the minimal polynomials of u over F and K respectively. Then
p ∈ K[x] and p(u) = 0, so q|p. Since p has distinct roots in some splitting field
L ⊇ K, q is separable over K.




EXERCISES 10.2 THE MAIN THEOREM OF GALOIS THEORY 


1. 


(a) By Example 4 §10.1, gal(E : Q) = ⟨σ⟩ ≅ C₄, where σ(u) = u². If H = ⟨σ²⟩
then H° is the only intermediate field (except Q, E). H° = Q(u + u⁴).

[Lattice diagram]

(c) By Exercise 9 §10.1, gal(E : Q) = ⟨σ, τ⟩ ≅ C₂ × C₂, where σ(i) = −i,
σ(√3) = √3; and τ(i) = i, τ(√3) = −√3. If H = ⟨σ⟩ and H₁ = ⟨τ⟩, the lattice
of fields is as shown. H° = Q(√3); H₁° = Q(i).

[Lattice diagram]

(ec) Ifu= 2, then E = Q(u,i) and, by Exercise 13 §10.1, gal(E : Q) = (o,r) & Da, 
where o(u) = iu, o(i) =i; and r(u) =u, T(i) = —7. The lattice diagram is shown. 


Primitive elements are FE = Q(u+i), (o7)° = Q(u? +7), (7)? =Q(u), (0)? = Q(i), 
(ro)° =Q(u-iu), (07,7)? =Q(u?), (ro?) = Q(iu), (7, To)° = Qliu?), and 
(ro%)° = Q(u + iu). 


. (a) Either G = C,2 = (¢): 


or GC, x Cp & (0,7): 
E 
ae. 
{o)° (r)° 


Sua 


. (a) Let r=r(t)= in € FE show f(t)g(—t) = f(—t)g(t). If char F #2, write 


h(t) = f(t)g(—-t). Show that h(t) = k(t?) for some polynomial k Similarly and 
g{t)g(—t) = l(t?) for some polynomial 1. Continue. 


. Clear. 

. (a) An intermediate field K is closed if K = K’’. 

. Exercise 34 §2.6. 

. (a) Use the Galois connection. 

. If K =o0(Ki) show oKjo™ = K' Ifo Kio = K show K; Ca} (K), so 0(K1) CK, 


and similarly K C o(K)). 


. (a) Let B= F(ui,ue,...,Um) where uz,...,Um are the distinct roots of f, use 


Theorem 3 §10.1. 




20. 
21. 




(a) Apply o to the formulas for N(u) and T(u). 
If f =v +uet+-+:+Ume™ show f* =] ],.o[2 —7o(u)] = f. 


EXERCISES 10.3 INSOLVABILITY OF POLYNOMIALS 


» (a) QVv3, V5, 77) 
. (a) f′ = 5x⁴ − 4 has roots ±a and ±ia, where a = (4/5)^{1/4}. Then f(a) < 0 and
f(−a) > 0, so f has three real roots and two (conjugate) nonreal roots. As f is
irreducible (Eisenstein), its Galois group is S₅, as in Example 1.


. Show that p = x” — 14x + 2 has three distinct real roots and two (conjugate) complex 


roots. If EF D Q is the splitting field, view G = gal[H : Q] as a subgroup of Sy where 
X CC are the roots. Then conjugation is a transposition and, if wu is a real root, 
then [Q(u) : Q] = 7 because p is the minimal polynomial of u over Q. Proceed as in 
Example 1. 


. Let X denote the set of roots of p in a splitting field HE D F where p € F[z]. View 


G = gal(E': F) C Sx, so G embeds in Sy. 


. Since f′ = 3(x² − 1), conclude that f has three real roots. In the cubic formula, p³
and q³ are roots of x² + x + 1 which satisfy p³ + q³ = −1 and pq = 1. The roots are w
and w² (w = e^{2πi/3}). Show that p = e^{2πi/9} and q = e^{16πi/9} = e^{−2πi/9} = p̄. The roots
are 2cos(2π/9), 2cos(8π/9) and 2cos(14π/9). Finally gal(E : Q) ≅ C₃.
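
The three cosine values can be checked numerically. The sketch below assumes the cubic is f = x³ − 3x + 1, which is inferred here from f′ = 3(x² − 1) together with p³ + q³ = −1 and pq = 1; the exercise statement itself is not reproduced in this answer section.

```python
import math

def f(x):
    """Assumed cubic: f(x) = x^3 - 3x + 1."""
    return x**3 - 3*x + 1

# Candidate roots 2cos(2*pi*k/9) for k = 1, 4, 7
ROOTS = [2 * math.cos(2 * math.pi * k / 9) for k in (1, 4, 7)]

if __name__ == "__main__":
    for r in ROOTS:
        print(f"x = {r:+.6f}   f(x) = {f(r):+.2e}")
```

The check works because f(2cos θ) = 2cos 3θ + 1, and 3θ is congruent to 2π/3 modulo 2π for each of the three angles.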


. (a) σ(Δ²) = Δ² for all σ ∈ G because σ permutes the roots uᵢ.


EXERCISES 10.4 CYCLOTOMIC POLYNOMIALS 
AND WEDDERBURN'S THEOREM 


1. 


3. 
5. 


(a) x⁴ + 1 (c) x⁴ − x² + 1 (e) x⁶ − x³ + 1

Use the Hint and induction. 

If w_k = e^{2πi/k}, these fields are Q(w_{mn}) and Q(w_m, w_n) respectively. Show that
Q(w_m, w_n) = Q(w_{mn}). (⊇ requires gcd(m, n) = 1).


. Write α(n) = Σ_{d|n} μ(d). If n = p₁^{n₁}p₂^{n₂}···pᵣ^{nᵣ} and m = p₁p₂···pᵣ, then α(n) = α(m).
If d|m, show μ(d) = 1 if and only if d is the product of an even number (possibly 0)
of the pᵢ, and μ(d) = −1 otherwise.


dln el(n/d) edln 


- (a) dude (5) =2 ud) > a = >> u(d)B(c) 


= LA) s ua) = B(n) 


e|n d\(n/c) 
by Exercise 7. 


EXERCISES 11.1 WEDDERBURN’S THEOREM 


. Show that 4 = Re and use Theorem 1 §7.1. 

. Straight forward. 

. If L C eRe is a left ideal, consider RL. 

. Use the Corollary to Lemma 2 and Theorem 5 §7.1. 

. (a) Show there exists a minimal member of S = {ann(X) | X C M and X is finite}. 




9. Let X = es m |m € Zand p does not divide m}.IfZ CY Cc X, where Y is a subgroup 

of X, doer that there exists 2 € Y with n maximal. Then show that Y = Lee 

11. (a) Ifa € M show that 2 — me) € ker 7. It follows that M = 1(M)-+ ker 1. Continue: 

13. (a) If K=K,@Ko@::- then K > Ke@K3@-:-DK,@Kyd°:: and Ki, CK 
ky C Ki @Kkye@K3c-:: 

15. Use the Hint. 

17. (a) Show that ker(α) ⊆ ker(α²) ⊆ ker(α³) ⊆ ... and apply the noetherian condition.

19. (a) To see that 6 is multiplicative, let O(r) = [rij] and @(s) = [siz], so that 
iia i a1 Tijy and ws = He $ijUj for each i. Compute wirs. 


EXERCISES 11.2 THE WEDDERBURN-ARTIN THEOREM 


2. (a) Show that R = L ⊕ M for some left ideal M, and 1 = e + f where e ∈ L and f ∈ M.
3. (a) The axioms are routinely verified.
5. Each 44 is simple. 
7. Use the preceding exercise and Lemma 3 §11.1.
9. Show that R is simple as a left R-module. 
11. The left ideals of the ring R/A are simultaneously left R-modules and left
R/A-modules with the same action.
13. (2) > (1) £042 € Re show that raz £0, a € R. Show that Rz = Re. 
15. Use Theorem 1(1) §11.1. 
16. (a) (ML)? = MLML. (b). (Ar)? = ArAr. 
17. Use Exercise 4. 
19. Use Lemma 3 §3.3. 
20. (a) Use the definition of domain. 
21. Use Lemma 9 and Theorem 3. 
22. (a) and (c) Use Schur’s lemma.


EXERCISES A COMPLEX NUMBERS 


1. (a) o=38 (c) f=0,%=4 — 
2. (a) 7-91 (at is (6) -i (e) ~4 — - 
3. (a) 1— 3% ea (c) 243i > 
5. (a) Unit circle (c) Line y= a (e) frER| re Oo} 
10. (a) 3/2e-t/4 (c) 2572/6 (e) is? 
13 
11. (a) -3 (c) ~V2 + V2i (e) a srs 
12. (a) =e W3i (c) 16 ie 
14. (a) + Fall +4), +(1 ~i) (c) 34, “(v3 4), x V3 - i) 


19. If f(x) = 2 + zie + 2227 +-:-+2z,2", the ae z* in Fla) Fle) is aokk 
2 Epa bes ep-1%1 + 22 = = teas + Zp%) + (z1Zp-1 + 2-121) +° ++ 24/22k/2 
where the last term is real but missing if k is odd. Each of the other summands i c 
also real, being a complex number plus its conjugate. 




EXERCISES B MATRIX ARITHMETIC 


. Use AT. 

. (a) Use the definition of matrix multiplication. 

. Compute. 

. IT and —J. 

. In general, show (A + B)(A − B) = A² − AB + BA − B².

Art At. 

11. (a) In general, if AC = I = CA then A is invertible and A⁻¹ = C.
(c) If I + BA is invertible, compute (I + AB)(I − A(I + BA)⁻¹B).

13. (a) Use Theorem 7. 

15. (a) Use the definition of matrix multiplication. 

(c) aᵢⱼEᵢⱼ has aᵢⱼ in the (i, j)-entry and zeros elsewhere.




EXERCISES C ZORN’S LEMMA 


2. (a) Let S = {X ⊆ M | X is a submodule and K ∩ X = 0}. Then S is nonempty
because 0 ∈ S, so let {Xᵢ | i ∈ I} be a chain from S and put U = ∪_{i∈I} Xᵢ. It is
clear that U is a submodule, and K ∩ U = 0 because each element of K ∩ U lies in
K ∩ Xᵢ = 0 for some i. Hence U is an upper bound for the chain {Xᵢ | i ∈ I}, so S
contains maximal members by Zorn’s lemma.


Abel, N.H. (1802-1829), 69, 202, 377, 397, 
413, 482, 435 

abelian group, 77 

finite p-groups, 341 

fundamental theorem, 346 

primary decomposition theorem, 339 
absolute value, 201 
absolute value of a complex number, 473 
action of a mapping, 10 
additive notation, 71 
al-Khowarizmi, M. (c.825), 202 
Alexandroff, P.S. (1896-1982), 196 
algebra, 448 

regular representation, 457 
algebraic closure, 289, 296 
algebraic element, 283, 286 
algebraic numbers, 289, 296 
algebraically closed field, 296 
alternating group, 62, 78, 128 
alternating polynomial, 245 
annihilator, 187, 456 

in a module, 327, 337 

in a ring, 182, 314 
annihilator ideal, 182
Archimedes of Syracuse (272-212 BC), 

224, 307 


Index 


Artin, E. (1898-1962), 160, 422, 447, 448, 
466 

artinian, left, 450 
associative law, 70 

for composition, 13 

general, 72 
automorphism 

Frobenius, 178 

inner, 104, 105, 140, 168 

of rings, 167 
automorphism group, 104 
axiom of choice, 489 
axiomatic method, 3 
axioms, 3 


Bézout domain, 272 
Bézout, E. (1730-1783), 34, 272 
basis 
of a module, 330, 462 
of a vector space, 278, 458, 487 
standard, 278, 331, 463 
Bass, H. (1932-), 468 
BCH codes, 319 
bijection, 11, 54 
bijection theorem, 22 
bijective mapping, 11 


Introduction to Abstract Algebra, Fourth Edition. W. Keith Nicholson. 
© 2012 John Wiley & Sons, Inc. Published 2012 by John Wiley & Sons, Inc.






binary operation, 70 
associative, 70 
closed under, 70 
commutative, 70 
componentwise, 80 
unity for, 70 
binomial 
coefficient, 26 
theorem, 27, 163 
bits, 144, 310 
Boole, G. (1815-1864), 170 
boolean ring, 170 
Brauer’s lemma, 465 
Brauer, R. (1901-1977), 465 
Burnside’s lemma, 383 
Burnside, W. (1852-1927), 383, 399, 405 
Burnside-Wielandt theorem, 405 


cancellation, 81 
Cantor, G. (1845-1918), 284, 297
Cardano, G. (1501-1576), 412, 435 
cartesian product, 8 
casting out nines, 46 
Cauchy’s theorem, 360 
Cauchy, A.L. (1789-1857), 360, 362, 383, 
432 
Cauchy-Frobenius lemma, 383 
Cayley table, 71 
Cayley’s theorem, 106 
extended, 365 
Cayley, A. (1821-1895), 69, 71, 106, 202, 
362, 447 
center 
of a group, 87, 123 
of a ring, 165 
central series, 402, 403 
central series of a group, 404 
ascending, 402 
descending, 403 
centralizer, 89, 131, 359 
chain condition 
ascending, 448 
descending, 448 
on principal ideals, 255 
character of a group, 422 
Dedekind’s lemma, 422 
independent, 422 
characteristic, 193 
characteristic of a ring, 163 
characteristic subgroup, 124, 130 
Chevalley, C. (1909-1984), 467 


Chinese remainder theorem, 48, 98, 195 
circle group, 77 
class equation, 359 
code 
(n,k)-code, 148 
BCH, 319 
binary, 143 
check bits, 145 
coset leader for, 149 
cyclic, 311, 312 
error correction, 147, 148 
error detection, 147, 148, 317 
Hamming (7,4), 151 
idempotent generator, 322 
matrix description, 151 
matrix generator, 315 
maximum likelihood decoding, 
144 
minimal generator, 312 
nearest neighbor, 146 
orthogonality theorem, 155 
parity check, 145 
perfect, 148 
polynomial/word form, 311 
standard array for, 150 
standard generator matrix, 
152 
syndrome, 154 
syndrome decoding, 155 
words, 144 
codomain of a mapping, 10 
Cohen, P. (1934-), 489 
combinatorics application, 382 
commutative law, 70 
commutative ring, 160 
simple, 184 
commutator, 134 
subgroup, 134 
complemented module, 459 
completing the square, 47 
complex number, 6, 471 
absolute value/modulus of, 473 
conjugate of, 473 
DeMoivre’s theorem, 476 
distance formula, 477 
exponential form, 474 
inverse of, 473 
modulus, 473 
multiplication rule, 475 
operations, 472 
polar form of, 475 
real/imaginary part, 472 


roots of unity, 476 
triangle inequality, 477 
complex plane, 471 
imaginary axis, 472 
real axis, 472 
component of an n-tuple, 8 
componentwise operation, 80, 160 
composite of mappings, 12 
composition factors, 390 
composition length, 390 
composition series 
group, 389 
module, 451 
conclusion, 2 
congruence modulo n, 43, 48, 50 
conjugacy class, 358 
conjugate elements, 358 
conjugate in quadratic integers, 
268 
conjugate of a complex number, 
473 
conjugate subgroups, 88, 105, 123, 
358 
constructible number, 305 
containment 
proper, 5 
contradiction, 2 
contrapositive, 5 
converse, 3 
convolution, 249 
core of a subgroup, 131, 364 
corner ring, 165, 454 
correspondence theorem, 353 
coset 
double, 117 
left/right, 109 
map, 132 
coset decoding, 149 
coset leader, 149 
coset map, 132 
counterexample, 3 
cubic formula, 435 
cubic polynomial, 205 
cycle, 58 
decomposition theorem, 60, 62 
length, 58 
structure, 60 
cycle index, 385 
cyclic codes, 311, 312 
cyclic group, 82, 91, 94 
fundamental theorem, 95 
of order n, 82 




cyclic subgroup, 91 
generator of, 91 
cyclotomic polynomial, 221, 442, 445 


D’Alembert, J. LeR. (1717-1783), 
324 
De Morgan, A. (1806-1871), 24, 412 
Dedekind’s lemma, 422 
Dedekind, R. (1831-1916), 159, 181, 185, 
196, 252, 271, 297, 422 
Dedekind-Artin theorem, 424 
degree of a polynomial, 205 
degree over F, 285, 414 
DeMoivre’s theorem, 476 
DeMoivre, A. (1667-1754), 476
density theorem, 467 
derivative, formal, 299 
derived 
series, 396 
subgroup, 134 
higher, 396 
Descartes, R. (1596-1650), 8, 
202 
Dickson, L. E. (1874-1954), 398 
dicyclic group, 375 
difference of two sets, 7 
dihedral group, 113, 121 
dimension of a vector space, 
279 
Dirac, P.A.M. (1902-1984), 349 
direct product, 80 
direct sum 
characterization, 328 
external, 325 
internal, 329 
Dirichlet, P.G.L. (1805-1859), 40 
discriminant 
of a cubic, 435 
of a polynomial, 442 
of a quadratic, 217, 434 
disjoint sets, 18 
divisible group, 335 
division algorithm, 266 
for integers, 32 
for polynomials, 207 
division ring, 166, 444 
divisor, 33, 221, 252 
common, 33 
greatest common, 33, 221 
domain, 172 
Ore, 177 
domain of a mapping, 10 




Doyle, A.C. (1859-1930), 67 
duplicating a cube, 304, 307 


Eisenstein criterion, 220
element of a set, 5 
elementary divisors, 346 
elementary symmetric polynomial, 242, 
309, 439 
elements of sets, 5 
embedding theorem, 177 
empty set, 6 
endomorphism, 452 
endomorphism ring, 452 
equivalence 
afforded by a partition, 19 
class, 17 
kernel of, 18 
logical, 3 
natural mapping, 19 
quotient set of, 19 
relation, 17 
Euclid of Alexandria (ca 330-275 BC), 35, 
37, 40 
euclidean algorithm, 35 
euclidean domain, 270 
Euler function, 114 
Euler, L. (1707-1783), 40, 114, 202, 362, 
435 
evaluation map, 208 
exponent laws, 72, 74, 81 
extension of fields, 283, 291 
abelian, 434 
algebraic, 283, 289 
closures in, 426 
cyclic, 434 
dimension of, 283 
F-automorphism, 413 
finite, 283, 287, 288 
Galois, 427, 441 
Galois group, 413 
intermediate field, 425 
main theorem of Galois theory, 
429 
norm, 434 
normal, 298 
primitive element theorem, 
419 
radical, 435 
separable, 418, 419 
separable closure, 422 
simple, 284, 302, 419 
trace, 434 


factor group, 131, 132 
factor modules, 327 
factor ring, 181 
ideals in, 183 
factor theorem, 209 
factorial, 26 
factorization, 252 
trivial, 252 
FC-group, 363 
Feit, W. (1930-2004), 129, 399 
Fermat’s theorem, 50 
Fermat, P. (1601-1665), 1, 50, 51, 115 
Ferrari, L. (1522-1565), 412, 435 
Ferro, S. (1465-1526), 412
field, 49, 166 
algebraic closure, 289 
algebraic element, 283 
algebraically closed, 296 
Dedekind’s lemma, 422 
Dedekind-Artin theorem, 424 
extension, 283 
finite, 298 
Galois, 300 
minimal polynomial of an element, 285, 414
of algebraic numbers, 289 
of constructible numbers, 305 
of quotients, 177 
perfect, 421 
prime, 194 
splitting, 292, 418 
transcendental element, 283 
field of quotients, 177 
figure, 117 
motion of, 117 
finite field, 298 
cyclic unit group, 302 
existence, 300 
primitive elements, 302 
subfields, 301 
uniqueness, 300 
finite subgroup test, 87 
finitely generated group, 96 
module, 330 
Fitting’s lemma, 457 
Fitting, H. (1906-1938), 408, 457 
Fitting’s theorem, 408
fixed field of a group, 424 
four square identity, 175 
Fourier, J.B.J. (1768-1830), 432 
Frattini argument, 373 
Frattini subgroup, 406 


Frattini, G. (1852-1925), 373, 406 
free module, 331, 462 
Frobenius 
automorphism, 178, 191, 299, 300 
endomorphism, 191 
theorem, 371
Frobenius, G. (1849-1917), 371, 372, 377, 383, 399, 447
fully invariant submodule, 461 
fundamental identities of a mapping, 14 
fundamental theorem 


finitely generated modules over a PID, 345 


for finite abelian groups, 346 

for vector spaces, 278 

of algebra, 309, 471 

of cyclic groups, 95 

of Galois theory, 429 

of symmetric polynomials, 242, 309 


Gödel, K. (1906-1978), 489
Galois connection, 426 
Galois extension, 427, 441 
Galois field, 300 
Galois group of a polynomial, 436 
Galois group of an extension, 413 
Galois theory 
main theorem, 429 
Galois’ criterion, 436 
Galois, E. (1811-1832), 69, 202, 298, 413,
432, 435, 436, 438 
Gauss, C.F. (1777-1855), 164, 216, 218,
224, 251, 252, 261, 308, 362 
formula, 25, 32 
Gaussian integers, 164 
gcd, 33, 34, 36, 39, 258
integers, 36 
polynomials, 222 
general associativity, 72 
general linear group, 80 
generator of a cyclic group, 82, 91 
generators 
of a field extension, 284 
of a group, 78, 100 
of a subgroup, 96 
Goldbach, C. (1690-1764), 40 
Goldie, A.W. (1920-2005), 468
Grassmann, H.G. (1809-1877), 159, 447 
greatest common divisor, 33, 39, 258 
group, 76 
abelian, 77 
actions (G-sets), 365 
alternating, 62, 78, 128 




automorphism, 104 
Burnside-Wielandt theorem, 405 
Cauchy’s theorem, 360 
Cauchy-Frobenius lemma, 383
Cayley’s theorem, 106 

center of, 87 

central series, 404 

character, 422 

circle group, 77 

class equation, 359 
composition series, length, factors, 389 
core of a subgroup, 131, 364 
correspondence theorem, 353 
cosets of a subgroup, 109 
cyclic, 82, 94 

dicyclic, 375 

dihedral, 113, 121 

direct product, 80 

divisible, 335 

extended Cayley theorem, 365 
extension problem, 388, 391 
factor group of, 132 
FC-group, 363 

finite abelian p-groups, 341 
finite p-groups, 338 

finitely generated, 96 
Frattini subgroup, 406 
G-set, 365 

general linear, 80 

Hall’s theorem, 399 
holomorph, 357 
homomorphism, 99 

image, 352 

isomorphic, 82 

isomorphism, 102 
isomorphism theorem, 138 
Jordan-Hölder theorem, 390
Klein, 83 

lattice diagram, 87 

maximal normal subgroup, 355 
metabelian, 357 

metacyclic, 355 

nilpotent, 404 

nongenerators of, 407 
normalizer, 359 

octic, 113 

of motions, 79, 117 

of units, 79, 165 

p-group, 360 

permutation, 79 

polycyclic, 401 

Prüfer, 449


product of subgroups, 125 
product of subsets, 350 
projective special linear, 398 
quaternion, 127 
relations in, 78 
Schur’s theorem, 381 
Schur-Zassenhaus theorem, 382 
second isomorphism theorem, 350 
semidirect product, 379 
simple, 128, 398 
simple factor, 355 
solvable, 395, 397, 436 
special linear, 86, 138 
subgroup of, 86 
Sylow theorems, 371 
symmetric, 54, 56, 78, 94 
third isomorphism theorem, 355 
torsion, 136 
torsion-free, 136 
translations of, 357 

group action, 365 
by conjugation, 366 
by multiplication, 366 
faithful, 370 
fixed element, 367 
fixed subset, 368 
fixer of, 367 
G-morphisms, 371 
orbit decomposition theorem, 368 
orbit in, 367 
stabilizer, 368 
transitive, 370 
trivial, 366 

group of units, 165 


Hölder, O. (1859-1937), 390
Hall’s theorem, 399 
Hall, P. (1904-1982), 399 
Hamilton, W.R. (1805-1865), 115, 159, 
173, 447 
Hamming, R. (1915-1998), 143, 146 
(7,4)-code, 151 
bound, 148 
code, 156 
distance, 146 
weight, 146 
Hankel, H. (1839-1873), 412 
Hardy, G.H. (1877-1947), 364 
Hermite, C. (1822-1901), 283 
Hertz, H.R. (1857-1894), 202 


higher derived subgroups, 396 
Hilbert, D. (1862-1943), 5, 159, 196 
Hobbes, T. (1588-1679), 304 
homomorphism, 100 

fixed element of, 294 

general ring, 189 

group, 99 

image of, 100, 137, 192, 326 

kernel of, 137, 192, 326 

module, 326 

preimage of, 141 

ring, 189 

trivial, 99, 138 
Hopkins-Levitzky theorem, 450 
hypothesis, 2 


ideal, 181 
annihilator, 187, 314 
left, 187, 488 
left/right, 326 
maximal, 184 
maximal left, 488 
prime, 182 
principal, 181 
proper, 181 
zero, 181 

idempotent, 75, 165, 453 
lifting, 468 

identity 
homomorphism, 326 
mapping, 13 
permutation, 55 


image of a homomorphism, 100, 137, 192, 326

image of a mapping, 12 
implication, 2 
inclusion mapping, 137 
indeterminate, 203
index of a subgroup, 111 
induction 

definition by, 29 

hypothesis, 24 

mathematical, 24 

principle of, 24

strong, 28 
inductive definition, 29 
inductive set, 457 
inner automorphism, 105, 140, 168 
integers, 5 

prime, 36 

relatively prime, 36 
integers modulo n, 44, 45, 47, 49 


integral domain, 172 
ACCP, 255 
as factor ring, 183 
associates in, 253 
Bézout, 272 
embedding theorem, 177 
euclidean, 270 
field of quotients, 177 
greatest common divisor (gcd), 258
irreducible in, 254
least common multiple (lcm), 258 
ordered, 199 
positive elements, 199 
prime element of, 257 
prime ideal of, 488 
principal ideal domain (PID), 264 
reducible in, 254 


unique factorization domain (UFD), 256 


well-ordered, 200 


intermediate field of an extension, 425 


internal direct sum, 329 
intersection of sets, 7 
invariant basis number (IBN), 332 
invariant factors, 346 
inverse 

in a monoid, 73 

left/right, 76 

of a complex number, 473 

permutation, 56 
inverse of a mapping, 14 
invertibility theorem, 15 
irreducible, 215, 254

polynomial, 215 
isometry, 119 
isomorphic groups, 82, 102 
isomorphic rings, 167 
isomorphism, 102 

of rings, 167 
isomorphism theorem 

group, 138 

module, 327 

ring, 192 

second group, 350 

second ring, 198 

third group, 355 

third ring, 198 


Jacobson radical, 467 

Jacobson’s theorem, 444 

Jacobson, N. (1910-1999), 444, 467 
Jordan, C. (1838-1922), 390, 398
Jordan-Hölder theorem, 461
group, 390
module, 451


Kaplansky, I. (1917-2006), 426


kernel of a homomorphism, 137, 192, 326 


Klein group, 83 

Klein, F. (1849-1925), 83 
Kronecker delta, 486 
Kronecker’s theorem, 233, 291 


Kronecker, L. (1823-1891), 23, 233, 271, 291, 296, 346
Kummer, E.E. (1810-1893), 181, 251, 271, 297


Lagrange 
four square identity, 175 
interpolation, 214 
polynomials, 214 


Lagrange’s theorem, 109, 111, 114, 115 


Lagrange, J.L. (1736-1813), 108, 111, 115, 175, 202, 214, 413, 432, 435
Laplace, P.S. de (1749-1827), 483 
lattice diagram, 87 
lcm, 39, 258
leading coefficient, 205 
least common multiple, 39, 258 
left ideal, 326 
Legendre, A.-M. (1752-1833), 432 
Lindemann, F. (1852-1939), 283, 307
linear combination, 33, 277 

trivial, 277 
linear polynomial, 205 
Lobachevski, N.I. (1793-1865), 310 
localization, 188 
logically equivalent, 3 


Möbius function, 446
Möbius inversion formula, 446
main theorem of Galois theory, 429 
Mal’cev, A.I. (1909-1967), 177 
mapping, 10, 15 

action of, 10 

bijective, 11 

composite, 12 

constant, 16 

domain, codomain, 10 

fundamental identities, 14 

identity, 13 

image, 10 


image of, 12 
inverse of, 14 
natural, 19 
one-to-one, 11 
onto, 11 
structure preserving, 99, 189, 190 
surjective, 11 
well-defined, 10 
matrix, 161, 162, 479 
identity, 162, 481 
m by n, 479 
main diagonal, 162, 481 
operations, 162 
parity check, 154 
product, 162, 480 
ring, 162 
similarity, 168 
square, 479 
zero, 162, 479 
matrix rings, 162 
ideals in, 185 
matrix units, 184 
Maurolico, Francesco (1494-1575), 24 
maximal 
subgroup, 405 
maximal ideal, 184 
maximal normal subgroup, 355 
McKay, J.H., 369 
metacyclic group, 355 
minimal polynomial over F, 285, 414 
modular irreducibility test, 220 
modular law 
for subgroups, 350 
module, 324, 325 
annihilator in, 327 
artinian, 448 
basis of, 330, 462 
complemented, 459 
composition series, length, factors, 451 
direct sum, external, 325 
direct sum, internal, 329 
endomorphism ring, 452 
factor modules, 327 
finite dimensional, 456 
finitely generated, 330 
free, 331, 462 
generating set, 330, 462 
homomorphism, 326 
indecomposable, 456 
independent set, 462 
independent subset, 330 


invariant basis number (IBN), 332 
isomorphic, 326 
isomorphism, 326 
isomorphism theorem, 327 
left, 325 
morphism, 326 
noetherian, 448 
over a PID, 335 
principal, 326 
projective, 332, 464 
rank, 333 
rank theorem, 332 
semisimple, 448, 458, 460, 462 
simple, 334, 450 
submodule, 326 
sum, 328 
torsion-free, 327 
trivial morphism, 326 
modules over a PID, 335 
annihilators in, 337 
decomposition of p-modules, 339 
free if and only if torsion-free, 335 
fundamental theorem, 345 
order of elements, 336 
p-modules, 339 
p-primary component, 337 
primary decomposition theorem, 337 
submodule theorem, 343 
torsion submodule, 336 
modulus, 43 
modulus of a complex number, 473 
monic polynomial, 205, 215, 219, 221-223 
monoid, 70 
commutative, 70 
monomial, 240 
morphism 
identity, 326 
image of, 326
kernel of, 326 
module, 326 
motion of a figure, 117 
multiplication rule for complex numbers, 
475 
multiplication theorem, 287 
multiplicative notation, 71 


natural mapping, 19 
natural numbers, 5 
negative, 45, 73, 160 
Newton identities, 244 
Newton, I. (1642-1727), 224 
nil radical, 188 


nilpotency class, 404 
nilpotent, 166, 488 
ideal, 465 
nilpotent group, 404 
Burnside-Wielandt theorem, 405 
Noether, E. (1882-1935), 159, 196, 448 
noetherian rings, 196
left, 450 
nongenerator in a group, 407 
norm in quadratic integers, 268 
normal closure, 131 
normal subgroup, 122, 131 
test for, 123 
normalizer, 359 
normalizer of a subgroup, 130 
number 
of elements in a set, 12 


octic group, 113 
one-to-one mapping, 11 
onto mapping, 11 
orbit, 367 
orbit decomposition theorem, 367, 
368 
order 
of a group, 77 
order of an element 
finite, 92 
in a module over a PID, 336 
infinite, 92 
ordered integral domain, 199 
ordered n-tuple, 8 
component of, 8 
ordered pair, 7 
Ore domains, 177 
Ore, O. (1899-1968), 177 
Oresme, Nicole (1323-1382), 8 
orthogonality theorem, 155 


p-group, 360 
finite, 339 
finite abelian, 338 
p-module, 339 
direct sum decomposition, 339 
elementary divisors, 340 
type, 341 
uniqueness of decomposition, 339 
parity, 61 
parity check matrix, 154, 315 
parity theorem, 61, 63 
partial fraction expansion, 237 
partial order, 486 




partially ordered set (poset), 486 
inductive, 487 
upper bound in, 487 
partition, 18 
cells of, 18 
singleton, 19 
theorem, 19 
trivial, 19 
Pascal, B. (1623-1662), 27, 30 
identity, 26 
triangle, 27 
Peano axioms, 29 
Peano, Giuseppe. (1858-1932), 29 
Pell’s equation, 269 
permutation, 54 
cycle, 58 
cycle index, 385 
disjoint, 57, 58, 60 
even, 61 
fixed/moved element of, 57 
identity, 55 
inverse, 56 
odd, 61 
of a set, 79 
sign of, 138 
permutation group, 79 
PID, 264, 335, 343 
and UFD’s, 265 
module decomposition, 335 
module is free iff torsion-free, 335 
primes in, 266 
PID (principal ideal domain), 264, 335 
pigeonhole principle, 4 
Poincaré, H. (1854-1912), 99, 117, 447 
pointwise operations, 161 
Poisson, S.-D. (1781-1840), 432
polycyclic group, 401 
polynomial, 203, 240 
alternating, 245 
coefficients, 203 
constant, 204 
constant coefficient, 204 
cubic, 205 
cyclotomic, 221, 442, 445 
degree, 205, 240 
derivative of, 299 
Eisenstein criterion, 220 
elementary symmetric, 242 
equality, 204 
error-locator, 321 
evaluation theorem, 208 
even/odd, 282 


factor is a field, 232 
factor rings, 230 
factor theorem, 209 
fundamental theorem of symmetric 
polynomials, 242, 309 
gcd, 222
homogeneous, 240 
homogeneous components, 240 
irreducible, 215 
Kronecker’s theorem, 233 
Lagrange, 214 
leading coefficient, 205 
least common multiple, 235 
lexicographic order, 241 
linear, 205 
minimal, 285, 414 
modular irreducibility test, 220 
monic, 205, 215, 219, 221-223 
negative of, 204 
Newton identities, 244 
partial fraction expansion, 237 
primitive, 260 
principal ideal domain, 227
proper factorization, 218 
quadratic, 205 
quartic, 205 
quintic, 205 
rational forms, 237, 440 
rational roots theorem, 211 
reducible, 215 
relatively prime, 222, 235 
remainder theorem, 209 
repeated root, 300 
ring, 203 
roots of, 210 
separable, 418 
several variables, 239 
solvable, 435 
splits, 292 
symmetric, 241, 309, 439 
symmetric rational forms, 
440 
unique factorization theorem, 
223 
zero, 204 
positive elements, 199 
positive numbers, 6 
power of an element, 72 
Prüfer group, 449
preimage, 141 
primary component, 337, 372 


primary decomposition theorem 
for finite abelian groups, 339 
for modules over a PID, 337 
prime, 32, 36, 37, 40 
in a PID, 337 
prime factorization, 36 
theorem, 37 
prime fields, 194 
prime ideal, 182 
prime number, 3 
prime power, 38 
prime ring, 469 
primitive element, 302 
primitive element theorem, 288, 419 
primitive polynomial, 260 
primitive root modulo p, 302 
primitive roots of unity, 302, 437, 476 
principal ideal, 181 
projection, 107, 334 
projections, 332 
projective module, 332, 464 
projective special linear group, 398 
proof, 2
by contradiction, 2 
direct method, 2 
reduction to cases, 2 
proper 
ideal, 181 
subgroup, 86 
submodule, 459 


quadratic 

integers, 266, 267 
quadratic formula, 202, 217, 397, 434 
quadratic polynomial, 205 
quartic polynomial, 205 
quaternion group, 127 
quaternions, 174, 179 

conjugate, 174 

norm, 174 
quintic polynomial, 205 
quotient, 32 
quotient set, 19 


radian measure of an angle, 474 
radical extension, 435 

rank of a module, 333 

rank theorem for modules, 332 
rational expression, 190 
rational forms, 237, 440 
rational numbers, 5 

rational roots theorem, 211 


real numbers, 6 
recursion theorem, 29, 490 
reducible, 254 
in an integral domain, 254 
reducible polynomial, 215 
regular n-gon, 120 
regular representation, 457 
relation, 17 
relatively prime, 36, 37 
integers, 36 
polynomials, 222 
remainder, 32 
remainder theorem, 209 
residue class, 43 
residue modulo n, 43 
ring, 160 
automorphism, 167 
binomial theorem, 163 
boolean, 170 
center, 165 
characteristic of, 163 


Chinese remainder theorem, 195 


commutative, 160 
corner ring, 165, 454 
decomposition theorem, 195 
density theorem, 467 
direct product, 161 
division, 166, 444 
endomorphism, 452 
factor, 181 

general, 160, 194 

group of units, 165 
homomorphism, 189 
ideal of, 181 
idempotent in, 165, 453 
isomorphism, 167 
isomorphism theorem, 192 
Jacobson radical, 467 
left artinian, 450 

left noetherian, 450 
lifting idempotents, 468 
local, 188 

maximal left ideal, 488 
negative in, 160 

nil radical of, 188, 488 
nilpotent ideal, 465 
nilpotent in, 166, 488 
noetherian, 196 

of functions, 161 

of matrices, 161 
opposite, 169 
polynomial, 203 


prime, 469 
prime ideal of, 488 
semiperfect, 468 
semisimple, 467 
simple, 183-185 
subring, 164 
subtraction, 163 
unit, 165 
unity of, 160 
upper triangular, 164 
zero ring, 161 

root 
multiplicity, 210 
of a polynomial, 210 
of unity, 77, 302, 437, 476 
rational, 211
repeated, 300 




Ruffini, P. (1765-1822), 397, 413 


Russell, B. (1872-1970), 9, 159, 489 


Schur’s lemma, 334, 454 
Schur’s theorem, 381 


Schur, I. (1875-1941), 381, 454 
Schur-Zassenhaus theorem, 382 


second isomorphism theorem 


for groups, 350 

for rings, 198 
semidirect product, 379, 380 
semiperfect ring, 468 


semisimple module, 448, 458, 460, 462 


homogeneous, 462 


homogeneous component, 461 


semisimple rings, 467 
separable closure, 422 
separable extension, 418 
sequence, 29, 203, 248, 490 

recursively defined, 490 
set, 5 

cartesian product, 8 

containment, 5 

difference, 7 

disjoint, 18 

element of, 5 

empty, 6 

equality, 5 

infinite, 6 

intersection, 7 

operations on, 7 

proper containment, 5

singleton, 6 

subset of, 5 

union, 7 




Shannon, C.E. (1916-2001), 143, 
394 
sign of a permutation, 66, 138 
similar matrices, 168
simple 
group, 128, 398 
module, 334, 450 
ring, 183-185 
solvable group, series, 395 
solvable polynomial, 435, 436
span of vectors, 277 
special linear group, 86, 138 
projective, 398 
split mapping, 463
splitting field, 292, 418 
existence, 292 
uniqueness, 294 
square free, 38 
stabilizer, 368 
Steinitz exchange lemma, 279 
subfield, 283 
subgroup, 86 
characteristic, 124, 130 
conjugate, 88, 123, 358 
cyclic, 91 
derived, 134, 396 
Frattini, 406 
generated by a set, 96 
generators, 89 
index of, 111 
maximal, 405 
maximal normal, 355 
normal, 88, 122, 131 
proper, 86 
self-conjugate, 88 
subnormal, 410 
test, 86 
torsion, 136 
trivial, 86 
unconnected set of, 352 
subgroup test, 86 
submodule, 326 
direct sum, 328 
fully invariant, 461 
maximal, 459 
principal, 326 
proper, 459 
sum of, 328 
submodule theorem, 343 
subset, 5 
subspace test, 276 
surjective mapping, 11 


Sylow p-subgroup, 372 

number of, 374 
Sylow theorems, 371 
Sylow’s first theorem, 372 
Sylow’s second theorem, 373 
Sylow’s third theorem, 374 
Sylow, L. (1832-1918), 371, 377 
symmetric group, 54, 56, 78, 94 


symmetric polynomial, 241, 309, 439 


elementary, 309, 439 
symmetric rational forms, 440 
symmetry of a figure, 119 
syndrome, 154 


Tartaglia, N. (1500-1557), 412, 435 
theorems, 3 
third isomorphism theorem 

for groups, 355 

for rings, 198 
Thompson, J.G. (1932-), 129, 399 
torsion group, 136 
torsion subgroup, 136 
torsion submodule, 334, 336 
torsion-free 

element, 327 

module, 327, 334 
torsion-free group, 136 
transcendental element, 283 
transposition, 60, 61 
trisecting an angle, 304, 307 
trivial factorization, 215, 252 
trivial homomorphism, 99, 138, 326 
trivial linear combination, 277 
trivial subgroup, 86 
Tucker, A., 143 


UFD, 256, 265 
characterization, 259 
Gauss’ lemma, 261 
polynomials, 261 
unconnected subgroups, 352 
union of sets, 7 
unique factorization theorem, 223 
unit, 73, 165 
circle, 474 
in a monoid, 79 
properties, 74 
unit circle, 474 
unity, 70 
left/right, 75 
unity for a binary operation, 70 
upper triangular, 164 


variety of groups, 401 
vector space, 275 
basis of, 278, 458, 487 
dependence in, 277 
dimension, 279 
dimension theorem, 282 
finite dimensional, 277 
fundamental theorem, 278 
invariance theorem, 279 
linear combination, 277 
linear independence, 277, 458, 487 
linear transformation, 282 
scalar multiples, 275 
spanning set, 277, 458, 487 
subspace, 276 
zero space, 276 
Venn diagrams, 7 
Venn, J. (1834-1883), 7 
Voltaire, F.M.A. (1694-1778), 274


Wedderburn’s theorem, 444 
on division rings, 173 
on simple rings, 455 
Wedderburn, J.H.M. (1882-1948), 159, 
173, 442, 443, 447, 455, 466 
Wedderburn-Artin Theorem, 457 




Weierstrass, K. (1815-1897), 297 
Weil, A. (1906-1998), 388
well-defined mapping, 10, 15, 131 
well-ordered integral domain, 200 
well-ordering 

axiom, 28 
Weyl, H. (1885-1955), 196, 432, 455 
Whitehead, A.N. (1861-1947), 1, 159 
Wielandt, H. (1910-2001), 377, 405 
Wiles, A., 51 
Wilson’s theorem, 49, 97 
Witt, E. (1911-1991), 443
word 

empty, 75 

juxtaposition, 75

length, 75 


Zassenhaus, H. (1912-1991), 381, 
394 

Zermelo, E. (1871-1953), 489 
zero

ideal, 181 

matrix, 162 

vector space, 276 
Zorn’s lemma, 333, 457, 487 
Zorn, M. (1906-1993), 487 




